FOREWORD


The Stanford Pascal verifier is an interactive program verification system. It automates much of
the  work  necessary  to  analyze  a program  for  consistency  with  its  documentation,  and  to  give  a 
rigorous  mathematical  proof  of  such  consistency  or  to  pin-point  areas  of  inconsistency.  It  has 
been  shown  to  have  applications  as  an  aid  to  programming,  and  to  have  potential  for 
development  as  a new  and  useful  tool  in  the  production  of  reliable  software. 

This  verifier  is  a prototype  system.  It  has  inadequacies  and  shortcomings.  It  is  undergoing 
continuous  improvement,  and  is  expected  to  be  used  eventually  in  conjunction  with  other  kinds  of 
program  analyzers.  The  purpose  of  this  manual  is  to  introduce  the  verifier  to  a wider  group  of 
users  for  experimentation.  We  hope  to  encourage  both  feedback  to  help  improve  this  system,  and 
the  development  of  other  program  analyzers. 

The verifier is coded in Maclisp, a version of Lisp developed at M.I.T. for PDP-10 computers.
Versions  of  the  verifier  run  under  the  TOPS-20  operating  system  and  the  Stanford  WAITS 
operating  system. 


How  to  read  this  manual 

The manual is divided into two parts. Notation based on the SAIL character set is used
throughout because it is closer to mathematical usage. The alternate notation based on ASCII is
sometimes indicated; the reader can always find the corresponding ASCII notation by referring to
Appendix A.

Part  I is  an  introduction  to  the  verifier.  It  contains  a short  survey  of  its  features  and  components, 
and  examples  of  its  use.  The  reader  who  has  completed  Part  I should  be  able  to  construct  simple 
examples  and  run  them.  He  should  also  have  gained  some  idea  of  what  the  verifier  can  do  and 
what  inadequacies  to  expect. 

Part  II  is  a manual  for  those  users  who  embark  upon  serious  experiments  with  the  verifier. 
Chapter 1 lists the differences between standard Pascal and the documented Pascal that the
verifier requires as input. The major difference is the required documentation. There are also
some minor differences in code. This is because it is planned that the verifier will accept a more
general  programming  language,  Pascal  Plus,  including  Modules  and  constructs  for  concurrent 
processing.  There  is  no  discussion  of  the  extended  language  in  this  manual. 

Chapter  2 describes  the  toplevel  user  commands. 

Chapter  3 is  a short  description  of  the  special  purpose  theorem  provers.  This  tells  the  user  what 
kinds of knowledge are "built in" and what he must describe to the verifier by means of rules.




Chapter 4 is about the Rule Language. This chapter is in two sections. The first describes the
rule language and how to express mathematical facts as rules; the second section gets into the
intricacies of writing rules and why rules written one way may lead the verifier into much more
efficient proof searches than if they are written another way. Section 1 should be enough for
many simple examples.

Appendix  A contains  syntax  charts  similar  to  the  charts  given  in  the  Pascal  User  Manual.  Here 
one  will  find  the  syntax  of  user  commands  for  running  the  verifier  and  the  syntax  of  input  to  the 
verifier,  i.e.,  programs,  assertions,  and  rules.  Also,  at  the  beginning  of  Appendix  A,  the  alternate 
ASCII  notation  for  mathematical  symbols  is  given.  Appendix  B is  a list  of  parser  error  messages 
with  a more  detailed  description  of  their  meaning  than  is  provided  by  the  comments  from  the 
system.  Appendix  C presents  the  axiomatic  semantics  used  by  the  verification  condition 
generator.


PART I

INTRODUCTION  TO  THE  STANFORD  PASCAL  VERIFIER 

1.  Overview


Section 2 gives a toplevel overview of the verifier and how it is used. Section 3 describes the
assertion language, the language in which the specifications of a program and the accompanying
internal inductive assertions must be written. There are some brief remarks about what kinds of
internal inductive assertions are required. A full description of compulsory assertions is given in
Part II, Chapter 1, and this information is also contained in the syntax charts. Section 4 outlines
some of the basic constructs of the rule language. Rules defining concepts used in assertions must
be written in the rule language. Part II, Chapter 4 gives more details about rules and how the
theorem prover uses them. Section 5 gives a number of examples illustrating the use of the
verifier. The first few are quite simple and should be sufficient to enable the reader to run some
simple examples of his own. The final example, on verifying a parser, illustrates formulation of
rules from mathematical theories and the use of the verifier in debugging and improving
specifications. At this point the reader is in a position to begin finding his own ways to use the
verifier. The methodology of using verification systems is by no means fully explored. Further
examples of verification experiments are given in the references at the end of Part II.


2.  The  Verifier 


The verifier employs the inductive assertion method due to Floyd [7] for reasoning about
programs. Floyd's method was developed into a logic of programs by Hoare [11] and others [3,
14]. The verifier constructs its proofs within this logic of programs. It requires as input a Pascal
program  together  with  documentation  in  the  form  of  inductive  assertions  at  crucial  points  in  the 
program  and  ENTRY/EXIT  assertions  attached  to  each  procedure. 

Fig. 1 shows what happens when the programmer gives this input to the verifier. The input goes
first to a verification condition generator which gives as output a set of purely logical conditions
called Verification Conditions (VCs). There is a VC for each path in the program. If all of the
VCs can be proved, the program satisfies its specification. The next step is to try to prove the
VCs using various algebraic simplification and proof methods. Those VCs that are not proved
are displayed for analysis by the programmer. If a VC is incorrect, this may reveal a bug in the
program or insufficient documentation at some point. A modification is made to the input and
the problem is rerun. If the unproven VCs are all correct, this merely indicates that the proof
procedures need more mathematical facts about the problem. The programmer then specifies
appropriate lemmas as rules using the Rule Language. These rules are input to the verifier
and the proof is attempted again. Ideally, the time for a complete cycle (Fig. 1) in a modern
interactive computing environment should be on the order of a minute for a one page program.


2.1  VCG

VCG  contains  a parser  and  a Verification  Condition  Generator  (VCGEN).  VCGEN  uses 
axiomatic  semantics  of  the  programming  language  to  generate  VCs.  We  chose  Pascal  because  at 
the  time  this  project  began  it  was  the  only  language  for  which  such  an  axiomatic  semantics  within 
Hoare’s  logic  of  programs  had  been  given  [13].  VCGEN  simply  takes  the  place  of  the  code 
generator  in  a compiler.  The  program  together  with  inductive  assertions  is  parsed  for  syntax  and 
type compatibility (see Part II, Chapter 1 for details). The result is an internal tree representation
from  which  the  VCs  are  constructed  by  transforming  the  inductive  assertions  as  a function  of  the 
code.  The  transformations  correspond  to  axiomatic  proof  rules  defining  the  meaning  of  the 
programming  language  constructs.  The  theory  of  VCGEN  is  presented  in  [14]. 
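
The idea of transforming an assertion "as a function of the code" can be sketched for the simplest case, straight-line assignments, where the assertion is pushed backwards through each assignment by substitution. The following Python sketch is only an illustration under stated assumptions: the function names, the string representation of assertions, and the sample loop body are hypothetical, and the actual transformations used by VCGEN are those defined by the axiomatic semantics in Appendix C.

```python
import re


def substitute(assertion, var, expr):
    """Return assertion with every whole-word occurrence of var replaced by (expr)."""
    pattern = rf"\b{re.escape(var)}\b"
    return re.sub(pattern, lambda _match: f"({expr})", assertion)


def push_back(assignments, post):
    """Transform post backwards through a straight-line list of (var, expr) assignments."""
    for var, expr in reversed(assignments):
        post = substitute(post, var, expr)
    return post


# Hypothetical loop body: X := X+Y; Z := Z+1, with invariant X = Y*Z.
body = [("X", "X+Y"), ("Z", "Z+1")]
invariant = "X == Y*Z"
transformed = push_back(body, invariant)  # "(X+Y) == Y*(Z+1)"


def vc_holds(x, y, z):
    """Check 'invariant implies transformed invariant' for one integer state."""
    env = {"X": x, "Y": y, "Z": z}
    return (not eval(invariant, {}, env)) or eval(transformed, {}, env)
```

The resulting formula, "invariant implies transformed invariant", is the kind of purely logical condition that is then handed to a prover; here it is merely tested on sample integer states rather than proved.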

The important point is that if all of the VCs can be proved then there is a proof within the weak
logic  of  programs  that  the  given  program  satisfies  its  ENTRY/EXIT  assertions  and  also  that  each 
subsection  of  the  program  satisfies  its  surrounding  inductive  assertions.  Such  a proof  can  be 
constructed  by  reversing  the  transformations  that  were  applied  by  VCGEN.  So,  the  VCs  are 
sufficient  conditions  for  correctness,  but  not  always  necessary  ones. 

The truth of the VCs often depends on how completely the inductive assertions describe sections
of the code. As a matter of practical convenience, the programmer should not be forced to supply
documentation beyond what is necessary to understand the program. The transformations
currently used by VCG are combinations of the axiomatic semantic rules of Pascal. The objective
of such combinations is to reduce the number of situations in which the user has to repeat his
assertions in trivial and tedious ways (this was a problem with earlier verifiers). The basic
assertion requirements are that procedure and function declarations must have ENTRY/EXIT
specifications, and loops within the body of the program must have invariant assertions. It is not
necessary to place assertions at all GOTO labels. There are other required assertions (e.g., for
global variables) and the details of these are in Part II, Chapter 1.

It  is  easy  to  modify  VCG  for  other  languages  that  have  axiomatic  semantics  formalizable  within 
the logic of programs. No other component of the verifier depends on the input programming
language. 


2.2  The  theorem  prover 

The prover takes a verification condition and attempts to prove it correct. If it succeeds, it
returns TRUE; if it proves that the verification condition is inconsistent, it returns FALSE; if
neither, it returns a simplified version of the verification condition.

The prover is the most complex component of the verifier. The major issue in its design is the
trade-off between generality (i.e., logical completeness) and its average response time to given
problems. If the theorem prover is very general, it takes too long to prove VCs and the user gives
up waiting. If it is too restricted in its logical power and must be told too many trivial facts
(e.g., x + 1 ≤ y → x < y) the user will quickly become frustrated.

We have tried to solve this problem by separating the prover into two parts. The first part,
called the "simplifier", contains built-in knowledge about the most common data structures of
programming languages (numbers, arrays, records, list structure) and very quickly simplifies
expressions involving these data structures. The second part of the prover is the "rulehandler",
which  uses  user-supplied  axioms  to  reason  about  data  structures  not  handled  by  the  simplifier. 
The  simplifier  is  thus  a very  efficient  but  very  specialized  prover  while  the  rulehandler  is  very 
general  and  not  necessarily  very  efficient.  How  the  two  components  coexist  is  a mystery  to  the 
authors  of  this  manual. 

As  we  shall  see  in  Part  II,  Chapter  3,  the  simplifier  includes  a decision  procedure  for  the 
quantifier-free  theory  of  rationals,  arrays,  records,  list  structure  and  uninterpreted  function  and 
predicate symbols under +, ≤, store and select, cons, car and cdr. The main pitfall with a built-in
simplifier such as this is that it is in fact "built in": its workings are hidden from the user.

The  rulehandler  accepts  rules  supplied  by  the  user  to  define  the  concepts  used  in  documenting  his 
program.  These  rules  are  treated  as  defining  axioms  for  these  concepts  and  are  automatically 
used by the prover in searching for a proof. The language for stating rules allows the user to
supply hints on how the rule is to be used. This is one method of making the search for a proof
more  efficient  (see  Bledsoe  [2]).  It  is  possible  to  write  a set  of  mathematical  facts  as  a set  of  rules 
in  different  ways,  some  resulting  in  much  more  efficient  behavior  from  the  rulehandler  than 
others.  Also,  sufficient  mathematical  facts  for  a proof  may  be  supplied,  but,  depending  on  how 
they are expressed as rules, the rulehandler may or may not succeed in finding a proof. In
Section 4 we briefly summarize the kinds of rules and their use. A detailed treatment of the rule
language and how to write rules is in Part II, Chapter 4.


3.  The Assertion Language


The  assertion  language  is  the  language  the  programmer  uses  to  document  his  programs.  A 
documented  program  is  a Pascal  program  containing  assertions;  assertions  are  required  at  certain 
points  in  programs,  and  are  optional  at  other  places.  An  assertion  is  a statement  of  relationships 
between  program  variables.  It  defines  properties  of  computation  states  that  must  be  true  every 
time  the  position  of  the  assertion  is  reached  during  a computation.  For  a theoretical  discussion  of 
assertions  and  the  logic  of  Floyd-Hoare  proofs,  refer  to  [11,  12,  14]. 

The assertion language of the Stanford verifier permits logical statements within the
quantifier-free first-order theories of arithmetic, arrays, records, and pointers (i.e., the standard
Pascal data types). Essentially this is the language of Pascal Boolean expressions extended in the
following way:

- auxiliary  user-defined  predicate  and  function  symbols  are  allowed 

- priorities  of  the  standard  Pascal  operators  conform  to  mathematical 
conventions  rather  than  Pascal 

- special  data  structure  terms  have  been  introduced  (see  below). 

There  is  not  much  of  a theory  of  designing  assertion  languages  at  present.  Assertion  languages 
may  well  become  program  specification  languages  later  on.  We  have  tried  to  keep  ours  simple, 
adding  new  features  only  when  the  need  for  them  is  clear. 


3.1  Kinds  of  assertion  statements 

Different  kinds  of  assertions  are  allowed  by  the  assertion  language.  We  have  introduced  eight 
kinds  of  assertions  to  aid  stating  specifications.  Four  of  these  apply  to  the  specification  of 
procedures.  In  addition  to  ENTRY  and  EXIT  assertions  there  are  two  others: 

The INITIAL declaration is used to describe the values of a parameter before and after a
procedure call. If procedure p(x) adds 1 to x we cannot simply say x>0 {p(x)} x=x+1. A
convention denoting tense is needed. An INITIAL statement allows naming entry values, e.g.,
INITIAL x=x0 ∧ x>0 {p(x)} x=x0+1.

The GLOBAL declaration permits the user to declare global variables of a procedure as formal
parameters. One important application of this is in dealing with pointer parameters. If a
procedure has a parameter of type ↑T, it is often necessary to declare the reference class (below) of
all objects of type T as a global variable. This permits the verifier to keep track of any
side-effects.

Other kinds of assertion statements are intended to avoid having to repeat inductive
assertions unnecessarily. The examples in Section 5 show the use of some kinds of assertions; a
complete list of kinds of assertions and compulsory assertions is given in Part II, Chapter 1.


3.2  Data  structure  terms 

The axiomatic theory of data structure terms has been introduced into the assertion language to
define  the  semantics  of  assignment  and  selection  operations  on  the  Pascal  structured  types 
ARRAY,  RECORD,  and  POINTER.  For  example,  a data  structure  term  of  the  form  <A,[I],E> 
denotes  the  array  obtained  from  A by  placing  E in  the  ith.  position;  <A,[I],E>[J]  denotes  the  jth. 
element  of  <A,[I],E>. 

We have similar terms denoting assignments to dereferenced pointers. For each pointer type
declaration, TYPE T=↑T0, the verifier introduces a reference class, called #T0, of all elements of
type T0. Pointers of type T are related to #T0 just as array indices are related to arrays.
Example: The reference class resulting from X↑←E is denoted by the term <#T0,⊂X⊃,E>.

The  ordinary  first-order  assertion  language  is  extended  to  express  the  effects  of  data  structure 
operations. The newly introduced functions are defined axiomatically.


3.2.1  Reference  class  identifiers 

We  introduce  new  individual  variables  called  reference  class  identifiers  into  the  assertion 
language. They have the form,

#<identifier>  where <identifier> is any legal Pascal type identifier.

Reference classes are not types in Pascal (although the syntax for bounded reference classes
appears in the early version of the Pascal specification). They are assertion language primitives
and behave very much like unbounded arrays. We will define the type of #T to be reference class
of T.


3.2.2  Functions  and  predicates  on  data  structures 

New  function  symbols  corresponding  to  the  Pascal  selection,  assignment,  and  new  operations  on 
complex  data  type  variables  are  introduced: 

Selection: x[y] (array selection), r.f (record selection), D⊂q⊃ (pointer selection)
Assignment: <x, [y], z> (array assignment), <r, .f, z> (record assignment),
<D, ⊂q⊃, z> (pointer assignment)
Extension: D∪q

The new terms formed by composition of these new functions must obey the Pascal type
compatibility requirements. Thus x[y] is legal only if x is of array type and y is of the correct
index type. Similarly <x,⊂y⊃,z> is legal only if x is of type reference class and x,y,z have
compatible types. To enforce this, the new functions have types. The type of x[y] is the type of
elements of the array term x. The type of <x,[y],z> is the same as that of x. If the type of x is
reference class of T, the type of x⊂y⊃ is T. The type of <x,⊂y⊃,z> is the same as that of x. The
type of D∪x is the same (reference class) type as D. The types for record terms are defined
analogously.

The definition of terms in the assertion language is extended to accommodate new terms created by
the  combination  of  reference  class  identifiers  and  the  special  functions.  Assertion  language  terms 
are: 

1.  all  Pascal  variables 

2.  all terms obtained from 1. and the new functions by function
composition  restricted  to  compatible  types. 

The  new  terms  are  called  data  structure  terms. 

Reference predicate: Pointer_To(X,D) means X is a pointer to a member of the reference class D.


3.2.3  Axioms  for  data  structure  terms 

The  selection  and  assignment  functions  satisfy  the  following  axioms  (all  the  free  variables  are 
universally  quantified): 

Ax 1.  Y=U → <X, [Y], Z>[U]=Z
Ax 2.  Y≠U → <X, [Y], Z>[U]=X[U]

Ax 3.  <X, .Y, Z>.Y = Z

Ax 4.  <X, .Y, Z>.U = X.U  where Y and U are distinct identifiers
Ax 5.  Y=U → <X, ⊂Y⊃, Z>⊂U⊃=Z
Ax 6.  Y≠U → <X, ⊂Y⊃, Z>⊂U⊃=X⊂U⊃

The extension function obeys three axioms:

Ax 7.  D∪X∪Y = D∪Y∪X

Ax 8.  X≠Y → (D∪X)⊂Y⊃=D⊂Y⊃

Ax 9.  X≠Y → <D,⊂Y⊃,Z>∪X=<D∪X,⊂Y⊃,Z>

Similarly, the predicate Pointer_To(X, D) obeys the following axioms:

Ax 10.  Pointer_To(NIL, D)

Ax 11.  Pointer_To(X, D∪X)

Ax 12.  Pointer_To(X, <D, ⊂Y⊃, E>) ≡ Pointer_To(X, D)

Ax 13.  X≠Y → (Pointer_To(X, D∪Y) ≡ Pointer_To(X, D))

A formulation of most of these axioms as verifier rules is given in 4.5.

Other standard lemmas may be derived from these axioms. For example, <A, [I], A[I]> = A can
be obtained as follows:

<A, [I], A[I]> = A if and only if (∀J) <A, [I], A[I]>[J] = A[J]

We prove by cases.

Suppose J≠I. Then, <A, [I], A[I]>[J] = A[J] from Ax 2.

Suppose J=I. Then, <A, [I], A[I]>[J] = A[I] = A[J] from Ax 1.

In both cases, <A, [I], A[I]>[J] = A[J]. Therefore, (∀J) <A, [I], A[I]>[J] = A[J].

These  axioms  form  a first-order  theory  of  data  structures.  The  terms  of  this  theory  represent 
finite  sequences  of  operations  on  data  structures.  The  theorems  are  logical  formulas  containing 
equalities  and  inequalities  between  data  structure  terms. 

For  example,  we  can  show  that  the  formula 

K≠I ∧ L=J → <<A, [I], <A[I], [J], 2>>, [K], B>[I][L] = 2

is a theorem of this theory. By axiom 2,

K≠I → <<A, [I], <A[I], [J], 2>>, [K], B>[I][L] = <A, [I], <A[I], [J], 2>>[I][L].

Axiom 1 implies <A, [I], <A[I], [J], 2>>[I][L] = <A[I], [J], 2>[L]
and finally L=J → <A[I], [J], 2>[L] = 2.
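
The array axioms and the two derivations above can be tested on a small concrete model. The sketch below is only an illustration: it assumes arrays are modeled as Python tuples indexed from 0, the function names are hypothetical, and an exhaustive check over a few small arrays is a sanity test, not a proof of the kind the verifier constructs.

```python
from itertools import product


def store(a, i, e):
    """<A, [I], E>: a copy of tuple a with position i replaced by e."""
    return a[:i] + (e,) + a[i + 1:]


def check_array_axioms(size=3, values=(0, 1, 2)):
    """Exhaustively check Ax 1, Ax 2 and the lemma <A,[I],A[I]> = A on small arrays."""
    for a in product(values, repeat=size):
        for y, u, z in product(range(size), range(size), values):
            if y == u and store(a, y, z)[u] != z:       # Ax 1
                return False
            if y != u and store(a, y, z)[u] != a[u]:    # Ax 2
                return False
        for i in range(size):
            if store(a, i, a[i]) != a:                  # derived lemma
                return False
    return True


def check_nested_theorem(size=2, values=(0, 2)):
    """Check K≠I ∧ L=J → <<A,[I],<A[I],[J],2>>,[K],B>[I][L] = 2 on small 2-D arrays."""
    rows = list(product(values, repeat=size))
    indices = range(size)
    for a in product(rows, repeat=size):
        for i, j, k, l, b in product(indices, indices, indices, indices, rows):
            if k != i and l == j:
                inner = store(a[i], j, 2)               # <A[I], [J], 2>
                if store(store(a, i, inner), k, b)[i][l] != 2:
                    return False
    return True
```

Both checks return True on this model, which agrees with the case analysis given in the text.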

In  order  to  express  many  complicated  properties  of  data  structures  we  need  to  introduce  auxiliary 
predicates.  For  example,  if  we  have  Pascal  type  definitions, 

type T0 = ↑T;

T = record ...; Next: T0; ...

it  may  be  necessary  to  make  assertions  about  'reachability'  between  pointers,  i.e.,  from  pointer  x 
one  can  reach  pointer  y by  performing  the  Next  operation  finitely  many  times.  We  introduce 
auxiliary  predicates  and  add  the  axioms  (D  ranges  over  terms  of  type  reference  class  of  T): 

Reach(D, x, y) =df (∃j) Reachstep(D, x, y, j)

Reachstep(D, x, y, 0) =df (x=y)

Reachstep(D, x, y, j+1) =df (∃z) Reachstep(D, x, z, j) ∧ D⊂z⊃.Next=y

Axiomatizations  of  auxiliary  concepts  must  be  supplied  by  the  programmer  as  rules  (see  Section  4 
and  examples,  especially  5.7). 
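
The definitions of Reach and Reachstep can be read as a recursive procedure over a finite reference class. The following sketch assumes a reference class is modeled as a Python dictionary from pointers to records, with NIL modeled as None; these modeling choices and all names are hypothetical. The unbounded quantifier (∃j) is replaced by a search up to the number of cells, which is enough for a finite class because a pointer reachable at all is reachable along a path that visits no cell twice.

```python
NIL = None


def reachstep(d, x, y, j):
    """Reachstep(D, x, y, j): y is reached from x by exactly j Next operations."""
    if j == 0:
        return x == y
    # (∃z) Reachstep(D, x, z, j-1) ∧ D⊂z⊃.Next = y, with z ranging over the cells of d
    return any(reachstep(d, x, z, j - 1) and d[z]["next"] == y for z in d)


def reach(d, x, y):
    """Reach(D, x, y): some j exists; on a finite class j <= len(d) suffices."""
    return any(reachstep(d, x, y, j) for j in range(len(d) + 1))


# A three-cell list 1 -> 2 -> 3 -> NIL used as sample data.
cells = {1: {"next": 2}, 2: {"next": 3}, 3: {"next": NIL}}
```

With this sample data, pointer 3 is reachable from pointer 1 but not the other way round.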

The  semantics  of  Pascal  array,  record,  and  pointer  operations  can  be  defined  by  Floyd-Hoare  style 
axioms  in  terms  of  the  theory  of  data  structures.  The  actual  semantics  used  in  the  verifier  is 
given in Appendix C.


4.  The Rule Language


4.1  Backward  rules 

Backward rules express logical implications, G→F, and are stated: INFER F FROM G. The
rulehandler component applies these rules in a depth-first backward-chaining search for proofs.
A rule will apply to a problem, A→B, if B is an instance of F. The rulehandler will then try to prove
A→G' (where G' is the corresponding instance of G). The rulehandler does not attempt to deduce
new rules from the given set.

Example:  In  5.2  we  formulate  a property  of  the  gcd  function: 

GCD4: INFER GCD(X,Y)=GCD(MOD(X,Y),Y) FROM Y>0;

Again, note that this rule will only be applied by the system if an instance of gcd(x,y) =
gcd(mod(x,y),y) occurs as a result to be proven during the proof.
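
The depth-first backward-chaining use of INFER rules can be sketched in a few lines. This is a deliberately simplified illustration, not the rulehandler's algorithm: it assumes formulas are opaque strings, so there is no instantiation of rule variables, and the depth bound and all names are hypothetical.

```python
def prove(goal, facts, rules, depth=8):
    """Depth-first backward chaining: a rule is tried only when its conclusion is the goal."""
    if goal in facts:
        return True
    if depth == 0:
        return False
    for conclusion, premises in rules:
        if conclusion == goal and all(prove(p, facts, rules, depth - 1) for p in premises):
            return True
    return False


# INFER F FROM G is represented as the pair (F, [G]).
rules = [("gcd(x,y)=gcd(mod(x,y),y)", ["y>0"])]
```

In this model, as in the text, the rule is consulted only when its conclusion occurs as the formula to be proved; with the hypothesis y>0 available the goal succeeds, and without it the search fails.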


4.2  Replacement  rules 

These express logical equivalences between atomic formulas, F≡G, and equalities between terms,
F=G, and are stated in the form: REPLACE F BY G. Whenever an instance of F occurs in a
VC the equality F=G is asserted. (Note that F is not replaced by G; rather, the notation "replace"
has historic reasons.)

Example: The following is used in 5.8.5:

CONSTANT NULL_SEQUENCE;

CON4: REPLACE CONCAT(X,NULL_SEQUENCE) BY X;

This rule asserts that concat(x, null_sequence) = x. Note, however, that this equality only becomes
known to the prover if an instance of concat(x, null_sequence) occurs during the proof.


4.3  Forward  rules 

Forward rules also express an implication G→F, but they differ from backward rules in the way
they are used in proof searches. These rules are written: FROM G INFER F. Forward rules can
be used to derive consequences from a set of known facts.

Example: The inference rule given in 4.1 can be rewritten as:

GCD4F: FROM Y>0 INFER GCD(X,Y)=GCD(MOD(X,Y),Y);

In this case the fact expressed by the rule would be known to the system as soon as a term y>0
becomes true during a proof.


4.4  Differences  between  rules 

Different rules may express the same logical statement. For instance the equivalence of two
formulas  A and  B can  be  stated  in  at  least  the  following  three  ways: 

REPLACE  A BY  B; 

INFER  A FROM  B;  INFER  B FROM  A; 

FROM  A INFER  B;  FROM  B INFER  A; 

The  reason  for  this  is  that  rules  not  only  express  logical  facts;  they  also  contain  information  for 
the  prover  on  how  and  when  to  use  those  facts.  Part  II,  Chapter  4 explains  how  the  different 
kinds  of  rules  are  used. 

The application of a rule can be limited by the use of restricting expressions. Suppose we want to
express the fact that x*y>0 if x>0 and y>0. We could write:

FROM X>0 ∧ Y>0 INFER X*Y>0;

This rule might, however, lead to very inefficient proofs. For each pair of terms known to be
positive, the fact that their product is positive will be asserted. From x>0 ∧ y>0 we derive not
only x*y>0 but also x*x*y>0, x*y*y>0, and so on. We can avoid this by adding a whenever
expression to the rule:

WHENEVER X*Y FROM X>0 ∧ Y>0 INFER X*Y>0;

The restriction X*Y limits the application of this rule to those x and y whose product appears in
the formula to be proved. Again, note that the use of restrictions is explained in Part II, Chapter 4.
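
The effect of the WHENEVER restriction can be illustrated by counting the facts each version of the rule would assert. The sketch below is an illustration only. It assumes positive terms are modeled as sorted tuples of variable names (so ("x", "y") stands for x*y), bounds the unrestricted rule by a number of rounds because it would otherwise never stop, and uses hypothetical function names throughout.

```python
from itertools import combinations_with_replacement


def product_term(a, b):
    """Represent the product of two terms as a sorted tuple of variable names."""
    return tuple(sorted(a + b))


def forward_unrestricted(positive, rounds):
    """FROM X>0 ∧ Y>0 INFER X*Y>0, applied to every known pair for a fixed number of rounds."""
    known = set(positive)
    for _ in range(rounds):
        pairs = combinations_with_replacement(sorted(known), 2)
        known |= {product_term(a, b) for a, b in pairs}
    return known


def forward_whenever(positive, goal_terms):
    """The same rule, applied only when the product term appears in goal_terms."""
    known = set(positive)
    changed = True
    while changed:
        changed = False
        for a, b in combinations_with_replacement(sorted(known), 2):
            term = product_term(a, b)
            if term in goal_terms and term not in known:
                known.add(term)
                changed = True
    return known
```

Starting from x>0 and y>0, two rounds of the unrestricted rule already yield 14 positive terms, whereas the restricted rule with only x*y in the formula to be proved yields 3.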


4.5  Rules  for  data  structure  terms 

The  axioms  of  the  theory  of  data  structures  were  given  in  3.2.3.  Below  we  give  a set  of  rules 
expressing  most  of  these  axioms.  The  axioms  omit  the  inequalities  between  all  pairs  of  distinct 
record field identifiers. At the moment, only some of the theory is implemented by the simplifier
and it is up to the user to include, in his rulefile, rules such as these to express any required data
structure  axioms: 

ARR0: REPLACE <A, [I], E>[J] BY CASES I=J → E; I≠J → A[J] END;

REC0: REPLACE <A, .II, E>.JJ BY CASES II=JJ → E; II≠JJ → A.JJ END;

PNT0: REPLACE <A, ⊂I⊃, E>⊂J⊃ BY CASES I=J → E; I≠J → A⊂J⊃ END;

PNT1: REPLACE A∪I⊂J⊃ WHERE I≠J BY A⊂J⊃;

PNT2: REPLACE <A, ⊂I⊃, E>∪J WHERE I≠J BY <A∪J, ⊂I⊃, E>;

PNT3: WHENEVER A∪X INFER POINTER_TO(X, A∪X);

PNT4: FROM POINTER_TO(X, A) INFER POINTER_TO(X, <A, ⊂Y⊃, E>);
PNT5: FROM POINTER_TO(X, A) INFER POINTER_TO(X, A∪Y);

PNT6: FROM ¬POINTER_TO(X, A∪Y) INFER ¬POINTER_TO(X, A);
PNT7: FROM TRUE INFER POINTER_TO(NIL, A);

PNT8: FROM POINTER_TO(X, A) ∧ ¬POINTER_TO(Y, A) INFER X≠Y;
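
One way to build intuition for the pointer rules is to test the corresponding axioms of 3.2.3 on a small model of a reference class. The sketch below rests on modeling assumptions that are not part of the manual: a reference class is a pair of an allocated-pointer set and a dictionary of cell contents, extension only enlarges the allocated set, and NIL is None. All function names are hypothetical.

```python
from itertools import product

NIL = None


def extend(d, x):
    """D∪X: add pointer x to the allocated set of reference class d."""
    allocated, values = d
    return (allocated | {x}, values)


def assign(d, y, z):
    """<D, ⊂Y⊃, Z>: d with the cell at pointer y set to z."""
    allocated, values = d
    return (allocated, {**values, y: z})


def select(d, y):
    """D⊂Y⊃: contents of the cell at pointer y (None if never assigned)."""
    return d[1].get(y)


def pointer_to(x, d):
    """Pointer_To(X, D): x is NIL or an allocated member of d."""
    return x is NIL or x in d[0]


def check_reference_axioms(pointers=(1, 2, 3), contents=("a", "b")):
    """Check Ax 5 to Ax 13 on one small reference class."""
    d = (frozenset({1}), {1: "a"})
    if not pointer_to(NIL, d):                                          # Ax 10
        return False
    for x, y, z in product(pointers, pointers, contents):
        ok = (
            extend(extend(d, x), y) == extend(extend(d, y), x)          # Ax 7
            and select(assign(d, y, z), y) == z                          # Ax 5
            and pointer_to(x, extend(d, x))                              # Ax 11
            and pointer_to(x, assign(d, y, z)) == pointer_to(x, d)       # Ax 12
        )
        if x != y:
            ok = ok and (
                select(assign(d, y, z), x) == select(d, x)               # Ax 6
                and select(extend(d, x), y) == select(d, y)              # Ax 8
                and extend(assign(d, y, z), x) == assign(extend(d, x), y, z)  # Ax 9
                and pointer_to(x, extend(d, y)) == pointer_to(x, d)      # Ax 13
            )
        if not ok:
            return False
    return True
```

The check succeeds on this model, which suggests the axioms are consistent with the intended reading of extension as allocation; it does not say anything about how the simplifier or rulehandler treats these terms internally.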


5.  Verification  Examples 


The paradigm employed in ordinary programming can roughly be described as follows: One starts
out with some concepts that describe what the program is supposed to do and how it will do it.
Such concepts may include arithmetical facts, properties of data structures, e.g., "array A is sorted",
and procedures, e.g., "exchange the ith and jth elements of array A". These concepts are well
enough understood that they are used to guide the human problem-solving activity that finally
results in a program. Many attempts have been made to formalize this activity as an ordered
sequence of steps, e.g., "requirements → code → documentation → testing", or by a "topdown"
method. Despite these attempts, normal programming activity seems well described by the
diagram,


CONCEPTS
    ↓
PROGRAM
    ↓
COMPILED CODE
    ↓
TESTING

In designing verifiable programs we advocate a completely different process. Again we start out
with concepts. But before writing any code we develop a formal theory of the concepts involved.
Often the concepts are already axiomatized (e.g., arithmetic) and one can use well known formal
theories. In other cases (e.g., business applications) the necessary formalisms have to be developed
from scratch. Hopefully this will change as more and more programs are verified and more
theories for important programming concepts become available.

Using our formal theory of the initial concepts we can rephrase the original problem by precisely
stating what the program is supposed to do within the formal theory. Now we are ready to
embark on writing a program. This will be done with the theory in mind, and at any stage we
may use documentation by inductive assertions (the assertions being formulas of the formal
theory) to justify a particular piece of code. Additionally, some program statements, e.g.,
procedures and loops, must have formal inductive assertions stating their behavior, i.e., certain
statements have a required documentation. This means in particular that each loop has an
associated invariant. The final product will be a program documented by precise formal
statements.

In parallel with writing the program, the axiomatic theory defining the programming concepts
must be expressed in a form accepted by the verifier, i.e., as "rules".

Finally,  the  program  and  the  rules  are  submitted  to  the  verifier.  The  result  may  or  may  not  be  a 
proof  of  the  correctness  of  the  program.  If  not,  we  have  either  written  a wrong  program  or 
inadequate  assertions,  or  the  rules  expressing  the  theory  are  insufficient  for  the  system  to  find  a 
proof. In each case we have to improve one of the above steps (specification, coding, rules) until a
proof  is  established. 

Graphically  the  verification  paradigm  for  program  development  can  be  represented  as  follows: 


[Figure: the verification paradigm. Boxes: CONCEPTS, FORMAL THEORY, RULES, SPECIFICATIONS, DOCUMENTED PROGRAM, VERIFICATION, COMPILED CODE.]


5.1  First  example:  understanding  VCs 

This  is  a simple  example  in  constructing  documented  programs  and  reading  very  simple  VCs. 
We  hope  eventually  to  automate  aids  for  analyzing  VCs. 

We begin by constructing a procedure that multiplies a given value parameter, Y, by a global
value, N, and stores the result in X; its specifications are:

PROCEDURE CONSTMULT(VAR X:INTEGER; Y:INTEGER);
GLOBAL (N);
EXIT X=Y*N;

We could implement this by repeatedly adding Y to X in a loop; if we use Z to count the number
of times the addition has been performed, we will expect X=Y*Z to be an invariant of the loop.
This should be sufficient internal documentation. Finally, we will try calling
CONSTMULT(X,N) to compute the square of N.

PASCAL

VAR N,Z:INTEGER;

PROCEDURE CONSTMULT(VAR X:INTEGER; Y:INTEGER);
GLOBAL (N);
EXIT X=Y*N;
VAR Z:INTEGER;
BEGIN
  X←0; Z←0;
  INVARIANT X=Y*Z
  WHILE Z≠N DO BEGIN X←X+Y;
                     Z←Z+1
               END
END;

EXIT Z=N*N;
BEGIN
  CONSTMULT(Z,N);
END.


For CONSTMULT to be consistent with its documentation, there are two VCs that must be
proved. VCs tell us what theorems are needed to prove the correctness of paths in the program.
The expressions in a VC are substitution instances of assertions and boolean tests in the program.
We can recognize which paths are in the VC by the values of loop and conditional tests, and
assertions appearing in the VC.


Unsimplified Verification Condition: CONSTMULT 1

(0=Y*0 ∧
 (X_0=Y*Z_0 ∧
  ¬(Z_0≠N)
  →
  X_0=Y*N))

This VC is of the form:

INVARIANT(0,Y,0) ∧ (INVARIANT(X_0,Y,Z_0) ∧ ¬LOOPTEST(Z_0,N) → EXIT(X_0,Y,N))

It implies the consistency of two paths:

(i) The path from the entry to the loop before it is executed: the initial values of X, Y, and Z
must satisfy the invariant, and since these values are X=Z=0, this requires 0=Y*0.

(ii) The path from the loop to the exit: since X and Z are variables of the loop, their final values
may differ from their initial values, so VCGEN has given these final values the new names X_0
and Z_0.

Unsimplified Verification Condition: CONSTMULT 2

(X=Y*Z ∧
 Z≠N
 →
 X+Y=Y*(Z+1))

This VC is of the form,

INVARIANT(X,Y,Z) ∧ LOOPTEST(Z,N) → INVARIANT(X+Y,Y,Z+1).

It corresponds to the path around the loop, and implies that X=Y*Z is an invariant. To prove it,
the prover will need the distributive law of arithmetic, which may be expressed by a rule as
follows:

RULEFILE (DISTRIBUTIVITY)

DIST: REPLACE A*(B+C) BY A*B+A*C;

It should be emphasized that such arithmetical rules can sometimes lead the prover into deducing
many irrelevant facts; for this rule to have the desired effect, the verifier parameter
SUMMATCH must be turned on (see Part II, Section 4.2.13).
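
The two CONSTMULT verification conditions are ordinary arithmetic formulas, so they can be sanity-checked by brute force. The following Python sketch is not part of the verifier; the names vc1, vc2, and holds_everywhere, and the integer range tested, are assumptions chosen for illustration. It evaluates each VC on every tuple of small integers; VC 2 holds exactly because of the distributive law.

```python
from itertools import product


def implies(p, q):
    """Material implication p -> q."""
    return (not p) or q


def vc1(y, n, x0, z0):
    # CONSTMULT 1: 0=Y*0 and (X_0=Y*Z_0 and not(Z_0 != N) -> X_0=Y*N)
    return (0 == y * 0) and implies(x0 == y * z0 and not (z0 != n), x0 == y * n)


def vc2(x, y, z, n):
    # CONSTMULT 2: X=Y*Z and Z != N -> X+Y=Y*(Z+1)
    return implies(x == y * z and z != n, x + y == y * (z + 1))


def holds_everywhere(vc, arity, lo=-4, hi=5):
    """Evaluate a VC on every tuple drawn from range(lo, hi)."""
    return all(vc(*args) for args in product(range(lo, hi), repeat=arity))
```

A bounded check of this kind can expose a wrong invariant quickly, but it is no substitute for the proof that the verifier constructs.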

Finally, proof of the procedure call depends on the VC,

Unsimplified Verification Condition: MAIN 1

(Z_0=CONSTMULT_X(Z,N,N) ∧
 Z_0=N*N
 →
 Z_0=N*N).

This is trivially true, but it is instructive to note what VCGEN is doing in constructing the VC.
It is of the form,

Z_0=FUNCTION(<initial values of all parameters>) ∧
CONSTMULTEXIT(Z_0,N)
→
EXIT(Z_0,N).

Z_0 is the final value of the actual VAR parameter Z (note this is the outer Z). This VC states
that the result of the procedure call (i.e., the EXIT assertion of CONSTMULT instantiated to the
final values of its actual parameters) may be assumed in proving the EXIT assertion of the main program.
Also, a function is constructed for each VAR parameter that maps the initial values of all
parameters (including the globals) into the final value of that VAR parameter; VCGEN appends
the formal parameter to the procedure name to make a unique function name (in this example,
CONSTMULT_X). This reflects the semantics of procedure call in [13].


5.2 Concepts, documentation and verification

As a next example we will verify a simple greatest common divisor (gcd) program. The concept
of the gcd is of course well known and we can base our documentation on the standard
mathematical properties by using the following lemmas for non-negative x and y.

gcd(x, 0) = x
gcd(x, x) = x
gcd(x, y) = gcd(y, x)
gcd(mod(x,y), y) = gcd(x,y)  if y>0
mod(x,y) < y

The program uses these properties by repeatedly replacing one of the values x or y by mod(x, y).

PASCAL

FUNCTION MOD(I,J:INTEGER):INTEGER;
ENTRY I≥0 ∧ J>0;
EXIT MOD≥0;
EXTERNAL;

FUNCTION G(X0,Y0:INTEGER):INTEGER;
ENTRY X0>0 ∧ Y0>0;
EXIT G=GCD(X0,Y0);
VAR X,Y,R:INTEGER;
BEGIN
  X←X0; Y←Y0;
  REPEAT R←MOD(X,Y);
         X←Y;
         Y←R
  UNTIL Y=0
  INVARIANT GCD(X0,Y0)=GCD(X,Y) ∧ X>0 ∧ Y≥0;
  G←X
END;.

The  invariant  for  the  REPEAT-loop  follows  immediately  from  our  basic  idea  of  replacing  one 
argument  of  gcd(x,y)  by  mod(x,y)  and  thereby  not  changing  the  value  of  gcd. 
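
To see the invariant at work, the loop can be transcribed into an executable form with the assertions checked at run time. The following Python sketch is only an illustration: the name g mirrors function G, Python's math.gcd stands in for the mathematical GCD, and the invariant is checked at the end of each pass through the loop body, which is an assumption about where the verifier attaches it.

```python
from math import gcd


def g(x0, y0):
    """Iterative gcd mirroring function G, with its assertions checked at run time."""
    assert x0 > 0 and y0 > 0              # ENTRY X0>0 and Y0>0
    x, y = x0, y0
    while True:                           # REPEAT ... UNTIL Y=0
        r = x % y                         # R <- MOD(X,Y); y > 0 at this point
        x = y
        y = r
        # INVARIANT GCD(X0,Y0)=GCD(X,Y) and X>0 and Y>=0
        assert gcd(x0, y0) == gcd(x, y) and x > 0 and y >= 0
        if y == 0:
            break
    assert x == gcd(x0, y0)               # EXIT G=GCD(X0,Y0)
    return x
```

On exit Y=0, so the invariant together with the lemma gcd(x, 0) = x yields the EXIT assertion.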

The next step towards a verification is to express the facts about gcd mentioned above in a form
acceptable to the verifier. In the rule language these facts can be expressed in various ways: one
can use forward or backward rules or any combination thereof. In the first case the prover would
deduce all terms equal to a term x as soon as it sees x. Going backwards the prover would try to
prove an equality only if it is needed. There is no general rule telling us which is better; each
method has its own advantages and disadvantages. Let us specify the properties of gcd with
forward rules.

RULEFILE (GCD)

GCD1: REPLACE GCD(X,0) BY X;
GCD2: REPLACE GCD(X,X) BY X;
GCD3: REPLACE GCD(X,Y) BY GCD(Y,X);
GCD4: REPLACE GCD(X,Y) WHERE Y>0 BY GCD(MOD(X,Y),Y);

Rewriting these rules as backward rules leads to the following rulefile, which is not sufficient for
the proof of all the verification conditions:

RULEFILE (GCD)

GCD1: INFER GCD(X,0)=X;
GCD2: INFER GCD(X,X)=X;
GCD3: INFER GCD(X,Y)=GCD(Y,X);
GCD4: INFER GCD(X,Y)=GCD(MOD(X,Y),Y) FROM Y>0;

Two verification conditions require commutativity (rule GCD3); these two formulas cannot be
proved with this set of rules. The reason is that the backward rule GCD3 is only applied if the
system tries to prove a formula that matches the pattern of the INFER clause. If we change the
rule GCD3 into

GCD3: INFER GCD(X,Y)=Z FROM GCD(Y,X)=Z;

we greatly increase the number of possible matches; in fact, using this modified rule, one can
verify the gcd program.


5.3  A hard  invariant 

The following example demonstrates that finding a suitable invariant is not always a simple task.
We want to emphasize, however, that this example is not typical of problems arising in practice.
In general we have some intuitive idea of what a loop is supposed to do and this will lead us to
finding the right invariant (in fact we ought to be able to write the invariant before we write the
code for the loop). In this example we find ourselves in the position of verifying a rather tricky
program, and finding its loop invariant requires understanding the trick. The program is an
iterative version of McCarthy's 91-function [21]. This function is recursively defined as

f(x) = if x>100 then x-10 else f(f(x+11))

It can be shown that this recursive function computes

f(x) = if x>100 then x-10 else 91

Now we want to show that the following program computes the same function:

PASCAL
LABEL 1;
VAR X,Y1,Y2,Z:INTEGER;
ENTRY TRUE;
EXIT (X>100 ∧ Z=X-10) ∨ (X<101 ∧ Z=91);
BEGIN
  Y1←X; Y2←1;
  1:
  ASSERT ????;
  IF Y1>100 THEN
    IF NOT(Y2=1) THEN BEGIN
      Y1←Y1-10;
      Y2←Y2-1;
      GOTO 1
    END
    ELSE Z←Y1-10
  ELSE BEGIN
    Y1←Y1+11;
    Y2←Y2+1;
    GOTO 1
  END
END.

The  entry  and  exit  assertions  simply  state  that  the  program  computes  the  same  function  as  the 
recursive  91-function.  The  difficult  part  is  to  find  a suitable  invariant  at  ????.  The  key,  of 
course,  is  to  first  understand  the  operation  of  the  program. 


Each time label 1 is reached the program starts computing f. There are two possible cases
depending on Y1. If the initial value of Y1>100, the program terminates immediately. In the
other case function f calls itself recursively, i.e., f(f(Y1+11)). The program computes the inner call
to f by jumping back to label 1. But in addition, it has to be recorded that upon completion of
this computation, the outer call has to be computed. This is done by incrementing the variable
Y2; thus Y2 tells us how many outer calls remain to be evaluated whenever we reach label 1.


Suppose at a given point in time all remaining outer calls will take the Y1>100 branch. Then
each time Y1 will be decreased by 10 and Z will become Y1-10*Y2. Since in this case Z has to be
91, we propose the invariant Y1-10*Y2=91. But this turns out to be too strong. It might be the
case that all but one of the outer calls are evaluated and we arrive at label 1 in a situation where
Y2=1 and Y1<101. In this case the loop will take the Y1≤100 branch and new recursive calls
have to be evaluated. Thus the invariant will only be Y1-10*Y2<92. This is still insufficient,
but the remaining details are fairly easy to find. First, we have to take care of the case where
X>100, i.e., the program terminates immediately. Second, we will need the fact that throughout
the loop Y2 is positive, so we have to add the conjunct Y2>0. Altogether we get the invariant:

((X>100 ∧ Y2=1 ∧ Y1=X) ∨ (X<101 ∧ Y1-10*Y2<92)) ∧ Y2>0
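
The invariant can be exercised by running the iterative program with the assertion checked each time label 1 is reached. The following Python sketch is an illustration under stated assumptions: the names f91_spec and f91_iterative are introduced here, and the GOTO structure is rendered as a while loop.

```python
def f91_spec(x):
    """Closed form of the 91-function given in the text."""
    return x - 10 if x > 100 else 91


def f91_iterative(x):
    """Iterative 91-function; the loop head plays the role of label 1."""
    y1, y2 = x, 1
    while True:
        # ASSERT at label 1: the invariant derived in the text.
        invariant = ((x > 100 and y2 == 1 and y1 == x)
                     or (x < 101 and y1 - 10 * y2 < 92)) and y2 > 0
        assert invariant, (x, y1, y2)
        if y1 > 100:
            if y2 != 1:
                y1, y2 = y1 - 10, y2 - 1   # one outer call evaluated
            else:
                return y1 - 10             # Z <- Y1-10
        else:
            y1, y2 = y1 + 11, y2 + 1       # one more outer call pending
```

Running f91_iterative over a range of inputs and comparing with f91_spec checks both the invariant and the EXIT assertion on those inputs only; the proof for all integers is what the verifier supplies.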


The following is a terminal session showing the verification of this program. Note that the
prover has to do some non-trivial reasoning to prove MAIN 4. The boldface characters were
typed by the user.

r verify

Hi there, welcome to the Pascal Verifier.
Version (VCG 4, SIMP 24.) (October 4 ...)
Type 'HELP;' for help

>read itf91.ver;

Reading file: ITF91.PAS[EX,VER]
SYNTAX SCAN COMPLETE.
PROGRAM PARSED.
CPU SECONDS: 0.389

>printvc;

Unsimplified Verification Condition: MAIN 1

(0<Y2 ∧
 (100<X ∧
  Y2=1 ∧
  Y1=X ∨
  X<101 ∧
  Y1-10*Y2<92) ∧
 100<Y1 ∧
 ¬(Y2=1)
 →
 (100<X ∧
  Y2-1=1 ∧
  Y1-10=X ∨
  X<101 ∧
  (Y1-10)-10*(Y2-1)<92) ∧
 0<Y2-1)


Unsimplified Verification Condition: MAIN 2

(100<X ∧
 1=1 ∧
 X=X ∨
 X<101 ∧
 X-10*1<92) ∧
0<1

Unsimplified Verification Condition: MAIN 3

(0<Y2 ∧
 (100<X ∧
  Y2=1 ∧
  Y1=X ∨
  X<101 ∧
  Y1-10*Y2<92) ∧
 100<Y1 ∧
 ¬¬(Y2=1)
 →
 100<X ∧
 Y1-10=X-10 ∨
 X<101 ∧
 Y1-10=91)


Unsimplified Verification Condition: MAIN 4

(0<Y2 ∧
 (100<X ∧
  Y2=1 ∧
  Y1=X ∨
  X<101 ∧
  Y1-10*Y2<92) ∧
 ¬(100<Y1)
 →
 (100<X ∧
  Y2+1=1 ∧
  Y1+11=X ∨
  X<101 ∧
  (Y1+11)-10*(Y2+1)<92) ∧
 0<Y2+1)

>simplify;

Simplified Verification Condition: MAIN 1
TRUE

Simplified Verification Condition: MAIN 2
TRUE

Simplified Verification Condition: MAIN 3
TRUE

Simplified Verification Condition: MAIN 4
TRUE


5.4 Defining concepts to document a program

The  next  program  we  will  verify  returns  the  maximum  value  of  an  array. 

To formalize the concept of the maximum of an array we define the predicate maxof(x, a, l, r) to
be true if x is the maximum of the array elements a[i] with l≤i≤r. We can give a formal
definition of maxof as:

maxof(x, a, l, r) ≡ (∀i)(l≤i≤r → a[i]≤x) ∧ (∃j)(l≤j≤r ∧ a[j]=x)

From this definition the following lemmas are immediate:

maxof(a[l], a, l, l)
maxof(x, a, l, r) ∧ a[r+1]≤x → maxof(x, a, l, r+1)
maxof(x, a, l, r) ∧ a[r+1]>x → maxof(a[r+1], a, l, r+1)

These lemmas may be written directly as backward rules without any changes of propositional
structure because they are all simple implications between conjunctions of atomic formulas. The
rules below, however, are weaker than these lemmas. They are sufficient for the verification of
this implementation of max because the array is scanned from 1 to N.
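
The definition of maxof translates directly into an executable predicate, which makes it possible to check the loop invariant of the max program on concrete arrays. The following Python sketch is an illustration only: array_max is a name introduced here, and the list is padded with a dummy element at index 0 so that the indices match the 1-based Pascal array.

```python
def maxof(x, a, l, r):
    """maxof(x, a, l, r): x bounds a[l..r] from above and occurs in a[l..r]."""
    indices = range(l, r + 1)
    return all(a[i] <= x for i in indices) and any(a[i] == x for i in indices)


def array_max(values):
    """Scan the array from 1 to N, checking the documented assertions."""
    a = [None] + list(values)              # pad so that a[1..n] holds the data
    n = len(values)
    assert n > 0                           # ENTRY N>0
    temp = a[1]
    for i in range(2, n + 1):
        assert maxof(temp, a, 1, i - 1)    # INVARIANT MAXOF(TEMP,A,1,I-1)
        if a[i] > temp:
            temp = a[i]
    assert maxof(temp, a, 1, n)            # EXIT MAXOF(MAX,A,1,N)
    return temp
```

Each pass through the loop body corresponds to one of the last two lemmas with l=1, which is why rules restricted to a lower bound of 1 suffice.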


The full input submitted to the verifier for this problem is given below. Pascal Plus permits
arrays in inner blocks to be dimensioned using VAR variables, and this is the reason for the
enclosing procedure DUMMY. (Note: "←" and ":=" are both accepted as notation for assignment.)

RULEFILE (MAX)

M1: INFER MAXOF(A[1],A,1,1);
M2: INFER MAXOF(X,A,1,I) FROM I≥2 ∧ A[I]≤X ∧ MAXOF(X,A,1,I-1);
M3: INFER MAXOF(A[I],A,1,I) FROM I≥2 ∧ A[I]>X ∧ MAXOF(X,A,1,I-1);

PASCAL

VAR N:INTEGER;

PROCEDURE DUMMY;
EXIT TRUE;
TYPE NARRAY=ARRAY[1:N] OF INTEGER;

FUNCTION MAX(A:NARRAY):INTEGER;
GLOBAL (N);
ENTRY N>0;
EXIT MAXOF(MAX,A,1,N);
VAR TEMP,I:INTEGER;
BEGIN
  TEMP:=A[1];
  FOR I:=2 TO N
    INVARIANT MAXOF(TEMP,A,1,I-1)
  DO
    IF (A[I]>TEMP) THEN TEMP:=A[I];
  MAX:=TEMP
END;

BEGIN END;.

It  is  instructive  to  look  at  the  unsimplified  verification  conditions.  At  this  stage  the  properties  of 
maxof  declared  in  the  rulefile  have  not  been  applied. 


Unsimplified Verification Condition: MAX 1

(0<N ∧
 2≤N
 →
 MAXOF(A[1],A,1,2-1) ∧
 (MAXOF(TEMP_0,A,1,(N+1)-1)
  →
  MAXOF(TEMP_0,A,1,N)))


Unsimplified Verification Condition: MAX 2

(0<N ∧
 N<2
 →
 MAXOF(A[1],A,1,N))


Unsimplified Verification Condition: MAX 3

(I≤N ∧
 2≤I ∧
 MAXOF(TEMP,A,1,I-1) ∧
 TEMP<A[I]
 →
 MAXOF(A[I],A,1,(I+1)-1))


Unsimplified Verification Condition: MAX 4

(I≤N ∧
 2≤I ∧
 MAXOF(TEMP,A,1,I-1) ∧
 ¬(TEMP<A[I])
 →
 MAXOF(TEMP,A,1,(I+1)-1))

The verifier partitions the paths of a program in a particular way and each VC corresponds to
one of these paths. MAX 1 corresponds to the path ENTRY → enter FOR-loop ∧ exit FOR-loop
→ EXIT; TEMP_0 is the final value of TEMP on leaving the loop. MAX 2 corresponds to the
path ENTRY → bypass FOR-loop → EXIT. MAX 3 and MAX 4 correspond to the two different
paths around the loop.

In practice, the initial rulefile is usually inadequate for the proof of all VCs. In this case
inspection of the unproven (but simplified) VCs will often suggest new rules or modifications.
These are then added to the rulefile and run in the verifier. This procedure is then repeated
until all VCs are proved.


5.5 Specifications for sorting

This Bubble sort example is documented by standard sorting concepts. Each concept has a simple
first-order definition (except permutation, see [12, 26]). For example,

ORDERED(A, L, R) means array A is ordered in the range [L, R]:

ordered(a, l, r) ≡ (∀i)(l≤i ∧ i<r → A[i]≤A[i+1]).

PARTITION(A, L, I, R) means that each element of A in [L, I] is smaller
than each element of A in [I+1, R]:

partition(a, l, i, r) ≡ (∀j, k)(l≤j≤i ∧ i<k≤r → A[j]≤A[k]).

Rules defining sorting concepts, including permutation, are given in [5]. The rules state not only
standard axioms satisfied by the concepts, e.g., transitivity of permutation, but also how the
concepts are related when operations are performed on arrays. Here is an example from [5]:

ORD6a: INFER ORDERED(<A,[P],X>,L,R) FROM ORDERED(A,L,R) ∧ L<P ∧ P<R ∧ X≤A[P+1]
       ∧ X≥A[P-1];

Rule ORD6a states conditions under which the array obtained from A by placing X in A[P] is
ordered.

The  rules  can  be  shown  correct  by  proving  them  from  the  first-order  definitions.  The  sorting 
concepts  may  be  used  to  document  many  different  sorting  algorithms,  and  the  same  defining  set  of 
rules  can  be  used  for  verification  [5]  (rules  for  the  theory  of  data  structures  are  also  needed). 
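
The first-order definitions above are directly executable on finite arrays, so the documentation of the Bubble sort given below can be checked at run time. The following Python sketch is an illustration under stated assumptions: ISBIGGER is not defined in this section, so its reading here (x is at least as large as every element of the range) is a guess, permutation is modeled as equality of multisets, and the list is padded at index 0 to match 1-based indexing.

```python
from collections import Counter


def ordered(a, l, r):
    """ordered(a, l, r): a[i] <= a[i+1] for all l <= i < r."""
    return all(a[i] <= a[i + 1] for i in range(l, r))


def partition(a, l, i, r):
    """partition(a, l, i, r): every element of a[l..i] is <= every element of a[i+1..r]."""
    return all(a[j] <= a[k] for j in range(l, i + 1) for k in range(i + 1, r + 1))


def isbigger(x, a, l, r):
    """Assumed meaning of ISBIGGER: x is >= every element of a[l..r]."""
    return all(a[j] <= x for j in range(l, r + 1))


def permutation(a, b):
    """Multiset equality of the data portions a[1..] and b[1..]."""
    return Counter(a[1:]) == Counter(b[1:])


def bubble_sort(values):
    a = [None] + list(values)          # pad so that a[1..n] holds the data
    a0 = list(a)                       # INITIAL A=A0
    n = len(values)
    for i in range(1, n):              # FOR I:=1 TO N-1
        assert permutation(a, a0) and ordered(a, n - i + 2, n)
        assert partition(a, 1, n - i + 1, n)
        for j in range(1, n - i + 1):  # FOR J:=1 TO N-I
            assert permutation(a, a0) and ordered(a, n - i + 2, n)
            assert isbigger(a[j], a, 1, j - 1) and partition(a, 1, n - i + 1, n)
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    assert permutation(a, a0) and ordered(a, 1, n)   # EXIT
    return a[1:]
```

Empty ranges make ordered and partition vacuously true, which is what establishes the outer invariant on the first iteration.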

PASCAL

VAR N:INTEGER;

PROCEDURE DUMMY;
EXIT TRUE;
TYPE NARRAY=ARRAY[1:N] OF INTEGER;

PROCEDURE SORT(VAR A:NARRAY);
GLOBAL (N);
INITIAL A=A0;
ENTRY N>1;
EXIT PERMUTATION(A,A0) ∧ ORDERED(A,1,N);
VAR I,J,TEMP:INTEGER;
BEGIN
  I:=1; J:=1;
  FOR I:=1 TO N-1
    INVARIANT PERMUTATION(A,A0) ∧ ORDERED(A,N-I+2,N) ∧
              PARTITION(A,1,N-I+1,N)
  DO
    FOR J:=1 TO N-I
      INVARIANT PERMUTATION(A,A0) ∧ ORDERED(A,N-I+2,N) ∧
                ISBIGGER(A[J],A,1,J-1) ∧ PARTITION(A,1,N-I+1,N)
    DO
      IF A[J]>A[J+1] THEN BEGIN
        TEMP:=A[J];
        A[J]:=A[J+1];
        A[J+1]:=TEMP
      END
END;

BEGIN END;.


5.6  A pointer  example 

The procedure below has a side-effect. It changes the contents of the cell referenced by its X
parameter by manipulating Y. The problem is to verify this. The type declaration, PNTR,
introduces the reference class #CELL of all cells referenced by pointers of type PNTR. #CELL is
a variable of the computation of SIDEFFECT although it cannot be mentioned in the code. It
must therefore be declared as a GLOBAL parameter of SIDEFFECT, and indeed as a VARiable
GLOBAL.

PASCAL

TYPE PNTR = ↑CELL;
     CELL = RECORD CAR:INTEGER END;

PROCEDURE SIDEFFECT(VAR Y:PNTR; X:PNTR);
GLOBAL (VAR #CELL);
ENTRY X↑.CAR = 1;
EXIT X↑.CAR = 2;
BEGIN
  Y:=X;
  Y↑.CAR:=2
END;.


The single verification condition for procedure SIDEFFECT is

(#CELL⊂X⊃.CAR=1 ∧
 POINTER_TO(Y,#CELL) ∧
 POINTER_TO(X,#CELL) ∧
 #CELL_0=<#CELL,
          ⊂X⊃,
          <#CELL⊂X⊃,.CAR,2>>
 →
 #CELL_0⊂X⊃.CAR=2)

The identifier #CELL_0 refers to the reference class after the operation Y↑.CAR:=2, which
changes one of the cells in #CELL (namely the one pointed to by Y). So the relationship between
them is

#CELL_0 = <#CELL, ⊂Y⊃, <#CELL⊂Y⊃, .CAR, 2>>.

The assignment of the value of X to Y makes this equivalent to the form that appears in the VC.
The VC is proved using rules for reference classes given in Section 4.5.
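
One way to picture a reference class is as a finite map from pointers to records, with assignment through a pointer producing a new map. The following Python sketch is a model introduced here for illustration, not the verifier's internal representation; the names update and sideffect and the use of dictionaries are assumptions.

```python
def update(ref_class, pointer, field, value):
    """Model of <D, ⊂p⊃, <D⊂p⊃, .field, value>>: a new reference class in
    which one field of the cell at `pointer` has changed and nothing else."""
    new_cell = dict(ref_class[pointer])
    new_cell[field] = value
    new_class = dict(ref_class)
    new_class[pointer] = new_cell
    return new_class


def sideffect(cell, x):
    """Transcription of SIDEFFECT over an explicit reference class."""
    assert cell[x]["CAR"] == 1            # ENTRY X↑.CAR = 1
    y = x                                 # Y := X
    cell_0 = update(cell, y, "CAR", 2)    # Y↑.CAR := 2
    assert cell_0[x]["CAR"] == 2          # EXIT X↑.CAR = 2
    return cell_0
```

Because the old class is left unchanged, both #CELL and #CELL_0 remain available, mirroring the way the VC refers to the reference class before and after the assignment.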


5.7  Verification  of  Pascal  list  structure  operations 

List  structures  are  usually  implemented  in  Pascal  by  means  of  pointers  and  records.  Verification 
of  programs  that  operate  on  lists  requires  introducing  higher  level  concepts  analogously  to  the 
sorting  concepts  for  sorting  operations  on  arrays.  List  operations  are  defined  in  terms  of 
operations  on  reference  classes. 

The procedure INSERT in the example below inserts a new word into a loopfree list. To prove
that INSERT preserves loopfreeness we use the Reach concept introduced in 3.2.3. The predicate
Reach(D,x,y) is true if by referring to the NEXT field repeatedly, starting at x, one can reach y;
i.e., the sequence x, D⊂x⊃.Next, D⊂D⊂x⊃.Next⊃.Next, ... in the reference class D contains the
pointer y. This implies that there are no loops between x and y.

PASCAL

TYPE REF = ↑WORD;
     WORD = RECORD COUNT:INTEGER; NEXT:REF END;

PROCEDURE INSERT(ROOT,Y,SENTINEL:REF);
GLOBAL (VAR #WORD);
ENTRY REACH(#WORD,ROOT,Y) ∧ REACH(#WORD,Y,SENTINEL) ∧
      Y≠SENTINEL ∧ Y≠NIL;
EXIT REACH(#WORD,ROOT,SENTINEL);
VAR Z:REF;
BEGIN
  NEW(Z);
  Z↑.NEXT←Y↑.NEXT;
  Y↑.NEXT←Z
END;.

The entry assertion implies that the list from ROOT to SENTINEL is loopfree and Y is a pointer
to a word in the list. The procedure inserts a new member of the list between Y and its successor.
The exit assertion implies that the result is still loopfree. This property of INSERT is easily
verified using the rules for data structures and some rules defining Reach.


Here are three examples of rules defining Reach:

R1: INFER REACH(D,X,Y) FROM REACH(D,X,Z) ∧ REACH(D,Z,Y);
R2: REPLACE REACH(<D,⊂X⊃.COUNT,E>,Y,Z) BY REACH(D,Y,Z);
R3: INFER REACH(<D,⊂Y⊃.NEXT,Z>,X,W)
      FROM REACH(D,X,Y) ∧ REACH(D,Z,W) ∧ ¬INBETWEEN(D,Y,Z,W);

Rule R1 is implied by the transitivity of Reach. Rule R2 states that operations on the COUNT
field, i.e., X↑.COUNT ← E, preserve loopfreeness. Finally, rule R3 states some conditions under
which the assignment, Y↑.NEXT ← Z, preserves loopfreeness between X and W. We can justify
the rules by proving them from the recursive definition of Reach given in 3.2.3. It is a
challenging exercise to construct axiomatizations of Reach that are complete in the sense that all
satisfying interpretations are isomorphic to linear lists.

Finally, suppose we reverse the last two statements of INSERT:

BEGIN
  NEW(Z);
  Y↑.NEXT←Z;
  Z↑.NEXT←Y↑.NEXT
END;.

The result of the attempted verification is:

Simplified Verification Condition: INSERT 1

(REACH(#WORD,ROOT,Y) ∧
 REACH(#WORD,Y,SENTINEL) ∧
 Y≠SENTINEL ∧
 Y≠NIL ∧
 POINTER_TO(ROOT,#WORD) ∧
 POINTER_TO(Y,#WORD) ∧
 POINTER_TO(SENTINEL,#WORD) ∧
 POINTER_TO(Z,#WORD) ∧
 ¬POINTER_TO(Z_0,#WORD) ∧
 #WORD_1=<#WORD∪Z_0,
          ⊂Y⊃,
          <#WORD⊂Y⊃,.NEXT,Z_0>> ∧
 #WORD_0=<#WORD_1,
          ⊂Z_0⊃,
          <#WORD_1⊂Z_0⊃,.NEXT,Z_0>>
 →
 REACH(#WORD_0,ROOT,SENTINEL))

The identifier Z_0 represents the new value of Z; #WORD_0 and #WORD_1 are reference classes
resulting from operations performed by INSERT. The conclusion of the VC is that #WORD_0 is
loopfree between ROOT and SENTINEL. But if we look at the expression for #WORD_0 in the
premise (this expression results from simplifications obtained from applying the data structure
rules) we see that the NEXT field of Z_0 is Z_0, clearly a loop. The expression for #WORD_1
shows that the NEXT field of Y points to Z_0, so this loop lies between ROOT and
SENTINEL; the desired result is false.
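
The difference between the two statement orders can be reproduced on a small concrete list. The following Python sketch is an illustration only: the reference class is modeled as a dictionary from pointer names to records, reach is one plausible executable reading of Reach (the recursive definition in 3.2.3 is not reproduced here), and the names insert and reversed_order are introduced for this example.

```python
NIL = None


def reach(d, x, y):
    """Assumed reading of Reach(D,x,y): following NEXT from x arrives at y
    without revisiting a cell and without running into NIL first."""
    seen = set()
    while x != y:
        if x is NIL or x in seen:
            return False
        seen.add(x)
        x = d[x]["NEXT"]
    return True


def insert(d, y, z, reversed_order=False):
    """Return the reference class after inserting a fresh cell z after y."""
    d = {p: dict(cell) for p, cell in d.items()}   # leave the old class intact
    d[z] = {"COUNT": 0, "NEXT": NIL}               # NEW(Z)
    if not reversed_order:
        d[z]["NEXT"] = d[y]["NEXT"]                # Z↑.NEXT ← Y↑.NEXT
        d[y]["NEXT"] = z                           # Y↑.NEXT ← Z
    else:
        d[y]["NEXT"] = z                           # Y↑.NEXT ← Z
        d[z]["NEXT"] = d[y]["NEXT"]                # Z↑.NEXT ← Y↑.NEXT, i.e. Z itself
    return d
```

With the statements in the original order the sentinel stays reachable from the root; in the reversed order the new cell points to itself, and the walk from the root never arrives at the sentinel.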


5.8  A larger  example 

We now present a verification of a simple parser. Here we have available the well-developed
theory of context free grammars to assist us in documenting the parser. This theory provides us
with the necessary concepts. Using user-defined predicates and rules, these concepts can then be
defined for use in the verification.


5.8.1  Theory 

We will briefly review the theory underlying the proof. A context free grammar is a tuple
<T, NT, P, {s}> where T and NT are the sets of terminal and nonterminal symbols, respectively.
The character s is a distinguished start symbol in NT and P is a relation over NT × (T∪NT)*.
The sets T, NT, and P are all finite. Whenever <t, r> is in P, then r is of finite length.

The relation "=>" is defined over (T∪NT)* × (T∪NT)* as follows:

<u.t.v, u.w.v> ∈ => iff <t, w> ∈ P.

We use periods to denote the concatenation of sequences over (T∪NT)*. The relation =>* is
defined to be the reflexive and transitive closure of =>.

The goal of the parser is to determine whether or not a given sequence over T* is in the
language generated by the grammar; i.e., whether s =>* u for the input u. To express the theory
in our assertion language we introduce the following two predicates:

isprod(t, w) iff <t, w> ∈ P
isderiv(x, v) iff x ∈ NT ∧ <x, v> ∈ =>*

From the definition of =>* one immediately gets two lemmas:

isderiv(x, x)
(isderiv(x, u.t.v) ∧ isprod(t, w)) → isderiv(x, u.w.v)


5.8.2  The  parsing  algorithm 

The parsing algorithm is standard (see [1], p. 177); we use a stack automaton and generate a top
down leftmost derivation of the input string. More precisely, we start with a stack containing the
start symbol s. Then we repeatedly take the top element t from the stack and if it is a
nonterminal symbol we push a w on the stack such that isprod(t, w). Otherwise, if t is a terminal
symbol and it conforms with the first symbol in the input, we skip this first symbol. If none of
these cases applies we report an error.
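
The algorithm just described can be sketched in a few lines. The following Python sketch is an illustration under stated assumptions: the toy grammar S -> a S b | c, the table inside isrhs, and the bookkeeping variable form are all introduced here and are not taken from the manual. The variable form records the sentential form derived so far, so the assertion at the end of each step checks that the input read so far followed by the stack contents is derivable from the start symbol.

```python
NONTERMINALS = {"S"}


def isrhs(t, look):
    """Hypothetical production chooser using one token of lookahead.
    Toy grammar: S -> a S b | c."""
    if t == "S" and look == "a":
        return ["a", "S", "b"]
    if t == "S" and look == "c":
        return ["c"]
    return None


def parse(tokens):
    """Return True if the token list is accepted, False if an error is reported."""
    stack = ["S"]            # top of the stack is the last list element
    read = []                # input consumed so far
    form = ["S"]             # sentential form derived so far
    pos = 0
    while stack:
        look = tokens[pos] if pos < len(tokens) else None
        t = stack.pop()
        if t in NONTERMINALS:
            w = isrhs(t, look)
            if w is None:
                return False                        # error: no production applies
            stack.extend(reversed(w))               # push w, first symbol on top
            form = read + w + form[len(read) + 1:]  # derivation step u.t.v => u.w.v
        elif t == look:
            read.append(t)                          # skip the matched input symbol
            pos += 1
        else:
            return False                            # error: terminal mismatch
        # Invariant: input read so far, followed by the stack, equals the derived form.
        assert form == read + stack[::-1]
    return pos == len(tokens)
```

For example, parse(list("aacbb")) accepts, while parse(list("ab")) reports an error because no production for S starts with b.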


5.8.3  Implementation 

First we decide upon the representation of the sets T, NT, and P in our program. The set T
will be an enumerated type called token, and the set NT will likewise be an enumerated type, nonterm. We
introduce a special type for T∪NT; this will be a record called item. Note that this could well be a
variant record; our system does not support variant records as such but does provide union types.

Sequences, that is elements from T* and (T∪NT)*, are represented as files; i.e., T* corresponds to
token_sequence, a file of token, and (T∪NT)* corresponds to t_nt_sequence, which is a file of item.
Note that for an actual implementation we would have to change t_nt_sequence to some type that
can be represented in memory (e.g., linked lists). However, for the presentation of this example we
will use files; a change in the data structure would not affect the overall structure of the
verification.

The representation of P is left undefined at this point; we assume the existence of an external
procedure isrhs which given t will return a w such that isprod(t,w) holds. The decision of what
production is to be applied next is hidden inside isrhs and not specified further. To allow a
reasonable implementation of isrhs we pass as an additional parameter the next character of the
input, as lookahead. Thus our parser can deterministically recognize any LL(1) grammar.

Three external procedures empty, push and top implement a stack of item. Note that push
pushes a whole sequence on the stack rather than a single element.

An external procedure error is used to issue error messages.

We distinguish between single elements of (T∪NT) and the sequence of length one of (T∪NT)*;
the function make_sequence takes an x ∈ (T∪NT) and converts it into <x> ∈ (T∪NT)*.


5.8.4  Specifications 

As might be obvious by now, we cannot prove that the parser will accept every legal input string,
because we have not made strong enough assumptions about isrhs.

Instead we will prove the following statement: if the parser terminates and does not issue an error
message, then the input string is in the language generated by the grammar.

This might seem to be a very weak statement; it is, however, a good illustration of the
difference between robustness, reliability, and correctness. With a suitable implementation of isrhs
the parser will reliably parse any legal input string; an implementation of the procedure error can
guarantee a reasonable recovery from syntax errors, thus making the program robust. In the case
where the parser terminates without an error message, the program proof will guarantee a correct
parsing of the input regardless of the actual implementations of error and isrhs.

In writing the assertions for this program we use the following functions:

imbed   maps a sequence over T* into a sequence over (T∪NT)*
concat  concatenates two sequences
append  appends a single element to a sequence
con1    places a single element in front of a sequence.

The invariant of the main loop states that the input read so far concatenated with the contents of
the stack is derivable from the start symbol. There is no magic in finding this invariant; it
corresponds closely to the induction hypotheses of the formal proof that each context free
grammar is accepted by a non-deterministic push down automaton ([1], p. 177).

To be able to formulate the invariant we include a virtual variable source_read which at any
point contains the portion of the input read so far.

5.8.5 Rules


The rules necessary for the proof of the parser can be divided into two parts. In the first part, we
have rules describing the properties of isprod and isderiv. Furthermore we have to specify
properties of the auxiliary functions used, i.e., append, concat, con1, imbed, make_sequence.

The rules ISD1 and ISD2 formulate the two lemmas for isderiv mentioned above. The rules
IMB1 through IMB6 express that imbed distributes over make_sequence, con1, etc. IMB7 and
IMB8 define imbed for a single element, i.e., this is mapped into one component of the record. In
the second part, we give rules that express trivial facts about sequences.

The  final  rulefile  is: 


RULEFILE (PARSER)

CONSTANT NULL_SEQUENCE, INFO1;

ISD1: INFER ISDERIV(X, MAKE_SEQUENCE(X));

ISD2: INFER ISDERIV(X, CONCAT(Z, CONCAT(R, T))) FROM
        ISDERIV(X, CONCAT(APPEND(Z, L), T)) ∧
        ISPROD(L, R);

IMB1: REPLACE IMBED(MAKE_SEQUENCE(X)) BY MAKE_SEQUENCE(IMBED(X));
IMB2: REPLACE IMBED(CON1(X, Y)) BY CON1(IMBED(X), IMBED(Y));
IMB3: REPLACE CONCAT(IMBED(X), IMBED(Y)) BY IMBED(CONCAT(X, Y));
IMB4: REPLACE IMBED(CONCAT(X, Y)) BY CONCAT(IMBED(X), IMBED(Y));
IMB5: REPLACE APPEND(IMBED(X), IMBED(Y)) BY IMBED(APPEND(X, Y));
IMB6: REPLACE IMBED(APPEND(X, Y)) BY APPEND(IMBED(X), IMBED(Y));
IMB7: WHENEVER IMBED(X) FROM TRUE INFER X=IMBED(X).INFO1;
IMB8: WHENEVER X.INFO1 FROM TRUE INFER X=IMBED(X.INFO1);

NS1: WHENEVER EMPTY(X) FROM EMPTY(X) INFER X=NULL_SEQUENCE;
NS2: FROM TRUE INFER IMBED(NULL_SEQUENCE)=NULL_SEQUENCE;
NS1A: WHENEVER EMPTY(X) FROM ¬EMPTY(X) INFER X≠NULL_SEQUENCE;
MS1: REPLACE CONCAT(MAKE_SEQUENCE(X), Y) BY CON1(X, Y);
APP1: REPLACE APPEND(NULL_SEQUENCE, X) BY MAKE_SEQUENCE(X);
FR1: WHENEVER FIRST(X) FROM TRUE INFER CON1(FIRST(X), REST(X))=X;
CON1: REPLACE CONCAT(X, CON1(U, V)) BY CONCAT(APPEND(X, U), V);
CON2: REPLACE CONCAT(APPEND(X, U), V) BY CONCAT(X, CON1(U, V));
CON3: REPLACE CON1(X, NULL_SEQUENCE) BY MAKE_SEQUENCE(X);
CON4: REPLACE CONCAT(X, NULL_SEQUENCE) BY X;
CON5: REPLACE CONCAT(NULL_SEQUENCE, X) BY X;

EOF: REPLACE EOF(X) BY EMPTY(X);
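
The sequence rules in the second part of the rulefile can be sanity-checked against a concrete model. The following is a minimal sketch, assuming sequences are modelled as Python lists, that con1 adds an element at the front and that append adds one at the end; this list model and the helper name check_sequence_rules are assumptions made for illustration and are not part of the verifier.

```python
# Hypothetical list model of the sequence functions named in the rulefile.
NULL_SEQUENCE = []


def make_sequence(x):
    """A sequence holding the single element x."""
    return [x]


def con1(x, s):
    """Put element x in front of sequence s."""
    return [x] + s


def append(s, x):
    """Put element x at the end of sequence s."""
    return s + [x]


def concat(s, t):
    """Join two sequences."""
    return s + t


def check_sequence_rules(x, u, s, t):
    """Return the names of the REPLACE rules that hold for these sample values."""
    checks = {
        "MS1": concat(make_sequence(x), s) == con1(x, s),
        "APP1": append(NULL_SEQUENCE, x) == make_sequence(x),
        "CON1": concat(s, con1(u, t)) == concat(append(s, u), t),
        "CON3": con1(x, NULL_SEQUENCE) == make_sequence(x),
        "CON4": concat(s, NULL_SEQUENCE) == s,
        "CON5": concat(NULL_SEQUENCE, s) == s,
    }
    return sorted(name for name, ok in checks.items() if ok)
```

Checking a few sample values does not prove the rules; it only guards against writing down a rule that is false in the intended model.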


We start out by attempting to verify the following version of the parser:

PASCAL

TYPE
  TOKEN = (EMPT,
           IDENT,
           NUMBER,
           PLUS_SYMBOL,
           AND_MANY_MORE);
  NONTERM = (START_SYMBOL, AND_SOME_MORE);
  TOKEN_SEQUENCE = FILE OF TOKEN;
  TERM_OR_NOT = (NONTERMINAL, TERMINAL);
  ITEM = RECORD
           KIND: TERM_OR_NOT;
           INFO1: TOKEN;
           INFO2: NONTERM
         END;
  T_NT_SEQUENCE = FILE OF ITEM;

VAR
  SOURCE, SOURCE_READ: TOKEN_SEQUENCE;
  R, STACK: T_NT_SEQUENCE;
  STRT, T: ITEM;
  LOOK: TOKEN;
  DONE: BOOLEAN;

PROCEDURE ERROR;
ENTRY TRUE;
EXIT ERROR_MSG(1);
EXTERNAL;

PROCEDURE ISRHS(VAR R: T_NT_SEQUENCE; T: ITEM; L: TOKEN);
ENTRY TRUE;
EXIT ISPROD(T, R);
EXTERNAL;

% Procedures implementing a stack %
FUNCTION EMPTY(ST: T_NT_SEQUENCE): BOOLEAN;
ENTRY TRUE;
EXIT TRUE;
EXTERNAL;

% Return the top of the parsing stack, pop this element %
PROCEDURE TOP(VAR X: ITEM);
GLOBAL (STACK);
INITIAL STACK=S0;
ENTRY TRUE;
EXIT (¬EMPTY(S0) → S0=CON1(X, STACK)) ∧ (EMPTY(S0) → ERROR_MSG(1));
EXTERNAL;

% Push x on the parsing stack; note that we push whole sequences
  rather than single elements. %
PROCEDURE PUSH(X: T_NT_SEQUENCE);
GLOBAL (STACK);
INITIAL STACK=S0;
ENTRY TRUE;
EXIT STACK=CONCAT(X, S0);
EXTERNAL;

% This function converts an element into a sequence of one element. %
FUNCTION MAKE_SEQUENCE(X: ITEM): T_NT_SEQUENCE;
ENTRY TRUE;
EXIT TRUE;
EXTERNAL;


% Main program %
INITIAL SOURCE=SOURCE0;
ENTRY ¬ERROR_MSG(1) ∧ EMPTY(STACK) ∧
      EMPTY(SOURCE_READ) ∧ ¬EMPTY(SOURCE);
EXIT ¬ERROR_MSG(1) → ISDERIV(STRT, IMBED(SOURCE0));
BEGIN
  STRT.KIND←NONTERMINAL; STRT.INFO2←START_SYMBOL;
  PUSH(MAKE_SEQUENCE(STRT));
  READ(SOURCE, LOOK);
  INVARIANT ¬ERROR_MSG(1) →
    (SOURCE0=CONCAT(SOURCE_READ, CON1(LOOK, SOURCE)) ∧
     ISDERIV(STRT, CONCAT(IMBED(SOURCE_READ), STACK)))
  WHILE NOT EOF(SOURCE) DO
  BEGIN
    TOP(T);
    IF T.KIND=TERMINAL THEN
      IF T.INFO1≠LOOK THEN ERROR ELSE
      BEGIN
        WRITE(SOURCE_READ, LOOK); % virtual %
        READ(SOURCE, LOOK)
      END
    ELSE BEGIN
      ISRHS(R, T, LOOK);
      PUSH(R)
    END
  END;
  IF NOT EMPTY(STACK) THEN ERROR;
END.

An attempt to verify this program succeeds in establishing the truth of 4 out of the 5 verification
conditions generated. The following VC is the only one which does not simplify to TRUE:

(¬ERROR_MSG(1) ∧
 EMPTY(STACK) ∧
 EMPTY(SOURCE_READ) ∧
 ¬EMPTY(SOURCE) ∧
 SOURCE=SOURCE0 ∧
 STRT_3=<STRT, .KIND, NONTERMINAL> ∧
 STRT_2=<STRT_3, .INFO2, START_SYMBOL> ∧
 STACK_7=PUSH_STACK(MAKE_SEQUENCE(STRT_2), SOURCE_READ) ∧
 STACK_7=MAKE_SEQUENCE(STRT_2) ∧
 SOURCE_4=READ_F_F(SOURCE0, LOOK) ∧
 LOOK_4=READ_X_X(SOURCE0, LOOK) ∧
 LOOK_4=FIRST(SOURCE0) ∧
 SOURCE_4=REST(SOURCE0) ∧
 EMPTY(SOURCE_3) ∧
 SOURCE0=APPEND(SOURCE_READ_2, LOOK_3) ∧
 ISDERIV(STRT_2, CONCAT(IMBED(SOURCE_READ_2), STACK_6)) ∧
 EMPTY(STACK_6)
 →
 ISDERIV(STRT_2, IMBED(SOURCE0)))


One way to prove this formula is to show that

imbed(source0) = concat(imbed(source_read_2), stack_6).

Given that source_3 and stack_6 are both empty this means showing

imbed(append(source_read_2, look_3)) = imbed(source_read_2).

But unfortunately, this VC is false; it cannot be proved from any set of consistent rules.
Consequently, the VC reveals an error in our program.
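
To see concretely why the required equality cannot hold, it helps to evaluate both sides in a small model. The sketch below assumes sequences are Python lists and that imbed wraps every token as a terminal item; both choices, and the function name goal_holds, are assumptions made for illustration. Appending one element always lengthens a list, so the two sides differ for every choice of values.

```python
def imbed(tokens):
    # Assumed model: each token becomes a terminal item, one for one.
    return [("TERMINAL", tok) for tok in tokens]


def append(seq, x):
    # Assumed model: append puts one element at the end of a sequence.
    return seq + [x]


def goal_holds(source_read_2, look_3):
    """Evaluate imbed(append(source_read_2, look_3)) = imbed(source_read_2)."""
    return imbed(append(source_read_2, look_3)) == imbed(source_read_2)
```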

Investigating  further,  we  find  that  the  unproved  verification  condition  comes  from  the  path 
which  starts  at  the  entry  assertion  of  the  main  program,  goes  to  the  main  loop,  and  then  to  the 
exit  assertion  of  the  main  program.  Looking  at  our  program  closely  we  find  that  in  fact  the  main 
loop  is  not  coded  correctly.  In  the  case  where  we  read  the  last  token  from  source  into  look  the 
main  loop  will  terminate.  However,  we  haven't  yet  made  the  necessary  reductions  to  derive  the 
entire  input  string. 
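
The effect of the faulty loop condition can be reproduced with a small simulation. The following is a sketch only: it models the stack and the files as Python lists, treats the ERROR procedure as an exception, and uses a hypothetical grammar with the single production START -> IDENT. The flag corrected switches between a loop that stops as soon as the source file is empty and a loop that stops only after the last token has been matched.

```python
TERMINAL, NONTERMINAL = "terminal", "nonterminal"


class ParseError(Exception):
    """Stands in for the manual's ERROR procedure."""


def isrhs(nonterminal, look):
    # Hypothetical grammar with the single production START -> IDENT.
    return [(TERMINAL, "IDENT")]


def parse(tokens, corrected):
    source = list(tokens)
    source_read = []                    # virtual variable: input matched so far
    stack = [(NONTERMINAL, "START")]
    look = source.pop(0)                # first READ; the entry assertion requires input
    done = False
    while (not done) if corrected else bool(source):
        if not stack:
            raise ParseError("TOP called on an empty stack")
        kind, info = stack.pop(0)       # TOP(T)
        if kind == TERMINAL:
            if info != look:
                raise ParseError("unexpected token")
            source_read.append(look)    # WRITE(SOURCE_READ, LOOK)
            if corrected and not source:
                done = True             # last token has now been matched
            else:
                look = source.pop(0)    # READ(SOURCE, LOOK)
        else:
            stack = isrhs(info, look) + stack   # PUSH(R)
    if stack:
        raise ParseError("stack not empty at end of input")
    return source_read
```

Under these assumptions the first loop accepts the two-token input IDENT IDENT while matching only one token, and rejects the valid input IDENT; the loop controlled by a done flag behaves the other way round.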


Having  found  this  error  we  change  the  program  to  the  following  one: 


% Main program %
INITIAL SOURCE=SOURCE0;
ENTRY ¬ERROR_MSG(1) ∧ EMPTY(STACK) ∧
      EMPTY(SOURCE_READ) ∧ ¬EMPTY(SOURCE);
EXIT ¬ERROR_MSG(1) → ISDERIV(STRT, IMBED(SOURCE0));
BEGIN
  STRT.KIND←NONTERMINAL; STRT.INFO2←START_SYMBOL;
  PUSH(MAKE_SEQUENCE(STRT));
  READ(SOURCE, LOOK);
  DONE←FALSE;
  INVARIANT ¬ERROR_MSG(1) →
    ((¬DONE → SOURCE0=CONCAT(SOURCE_READ, CON1(LOOK, SOURCE))) ∧
     ISDERIV(STRT, CONCAT(IMBED(SOURCE_READ), STACK)) ∧
     (DONE → (EMPTY(SOURCE) ∧ SOURCE0=SOURCE_READ)))
  WHILE NOT DONE DO
  BEGIN
    TOP(T);
    IF T.KIND=TERMINAL THEN
      IF T.INFO1≠LOOK THEN ERROR ELSE
      BEGIN
        WRITE(SOURCE_READ, LOOK); % virtual %
        IF EOF(SOURCE) THEN DONE←TRUE ELSE READ(SOURCE, LOOK)
      END
    ELSE BEGIN
      ISRHS(R, T, LOOK);
      PUSH(R)
    END
  END;
  IF NOT EMPTY(STACK) THEN ERROR;
END.

This corrected program can be verified using the rulefile given above. To show that this proof is
not at all trivial we include one of the unsimplified VCs:

(¬DONE ∧
 (¬ERROR_MSG(1)
  →
  (¬DONE
   →
   SOURCE0=CONCAT(SOURCE_READ, CON1(LOOK, SOURCE))) ∧
  ISDERIV(STRT, CONCAT(IMBED(SOURCE_READ), STACK)) ∧
  (DONE
   →
   EMPTY(SOURCE) ∧
   SOURCE0=SOURCE_READ)) ∧
 (EMPTY(STACK)
  →
  ERROR_MSG(1)) ∧
 (¬EMPTY(STACK)
  →
  STACK=CON1(T_0, STACK_2)) ∧
 T_0=TOP_X(T, STACK) ∧
 STACK_2=TOP_STACK(T, STACK) ∧
 T_0.KIND=TERMINAL ∧
 T_0.INFO1≠LOOK ∧
 ERROR_MSG(1) ∧
 ¬ERROR_MSG(1)
 →
 (¬DONE
  →
  SOURCE0=CONCAT(SOURCE_READ, CON1(LOOK, SOURCE))) ∧
 ISDERIV(STRT, CONCAT(IMBED(SOURCE_READ), STACK_2)) ∧
 (DONE
  →
  EMPTY(SOURCE) ∧
  SOURCE0=SOURCE_READ))


1. DIFFERENCES FROM STANDARD PASCAL


The verifier accepts most of the constructs of Pascal, modified in some cases for assisting
verification. What follows is a list of the known differences between the language accepted by the
verifier and "standard" Pascal as presented in Jensen and Wirth [15]; this list does not discuss the
syntax or semantics of the rule language also accepted by the parser.


1.1  Comments 

The scanner for all code ignores text surrounded by percent (%) signs. Thus, comments may
be added to code in this manner.


1.2  Program  files 

The  Pascal  code  begins  with  the  word  PASCAL.  The  last  character  in  the  file  should  be  a 
period  (.).  An  end-of-file,  except  from  the  terminal,  is  accepted  in  lieu  of  a final  period.  A main 
program  need  not  be  present.  Procedures  must  have  a body,  but  it  can  be  empty. 


1.3 Procedure definitions

The GLOBAL, INITIAL, ENTRY, and EXIT statements (in that order) may follow a
PROCEDURE or FUNCTION statement. The first three are optional; the last one must be
there. For example:

PROCEDURE P(VAR X: INTEGER; Y: REAL);
GLOBAL (A; VAR Z);
   %Here the global Z may be changed by this procedure;
    the global A may be referenced by this procedure.%
INITIAL X=X0, Z=Z0;   %X0 and Z0 may appear only in assertions.%
ENTRY FOO(X, Y, A);
EXIT MUMBLE(X, X0, Y) ∧ BUMBLE(A, Z0);
TYPE ...
VAR ...
BEGIN ... END;

Functions may not have an INITIAL statement.

In the outermost block of a program, the ENTRY and EXIT assertions appear immediately
preceding the BEGIN that starts the block. The order is ENTRY then EXIT, or just EXIT. An
INITIAL statement, if present, precedes the ENTRY/EXIT assertions. Thus:

PASCAL
TYPE ...
VAR ...
PROCEDURES AND FUNCTIONS
EXIT MUMBLE(A, B);
BEGIN
   %main block of the program%
END.


1.4  Assertions 

The  ASSERT,  COMMENT,  and  ASSUME  documentation  statements  have  been  added  to  Pascal 
for  verification  purposes. 

ASSERT  <formula> 

COMMENT  <formula> 

ASSUME  <formula> 

The ASSERT statement breaks a proof into two separate verification conditions. The
COMMENT statement does not cause a break, but adds an additional fact (which must be
verified) to the verification condition. The ASSUME statement does not cause a break; it adds
an additional assumption to the verification without requiring proof. For further details see
Appendix C.

Each repetitive statement requires an invariant to be specified. Thus:

INVARIANT <formula> WHILE ... DO ...

FOR ... INVARIANT <formula> DO ...

REPEAT ... UNTIL ... INVARIANT <formula>


1.5  Blocks 

As in Pascal, declarations must appear in the order LABEL, CONSTANT (or CONST),
TYPE, VAR, functions and procedures. Unlike Pascal, you may have more than one CONST,
TYPE, or VAR statement.


1.6 Types

A previously  known  integer  or  real  variable  identifier  may  appear  as  one  of  the  array  bounds  for 
the  purpose  of  defining  an  array  type  by  subrange. 

Variant records and sets have not been implemented. Functions and procedures may not be
passed as parameters.

The type CHAR has been implemented, but only character constants one character long may
appear in programs. Packed arrays are not implemented. The character delimiter is the single
quote, e.g., 'a'.

A TYPE  may  not  be  redefined  as  another  TYPE  within  its  scope.  It  may  be  redefined  as  a 
constant,  var,  procedure,  or  function;  however  that  makes  the  type  invisible  within  the  scope  of 
that  redefinition  and  will  cause  a syntax  error  if  any  attempt  is  made  to  reference  vars  of  the 
redefined  type. 


1.7  Functions 

There  is  very  strict  enforcement  of  rules  to  ensure  that  functions  have  no  side-effects.  The 
following  are  prohibited  in  functions  (not  procedures): 

VAR  parameters 
The  NEW  statement 
Calling  procedures  that  change  globals 
Changing  globals 

READ,  WRITE,  and  REWRITE  statements 

Note  that  a reference  class  is  a global.  Thus,  assigning  to  any  dereferenced  pointer  will  cause  an 
error. 


1.8 Input/Output

The only I/O statements allowed are EOF, READ, REWRITE, and WRITE. EOF takes an
entity of type FILE and returns TRUE or FALSE. READ and WRITE each take only two
arguments; the first is a file and the second is an entity of the same base type as the file. Files
may be declared in the usual manner; however, an entity of type FILE may appear in executable
code only in the READ, WRITE, and REWRITE statements (or be passed as a parameter).


1.9 Global variables

Any global variable that can be changed in a procedure must appear in a GLOBAL statement
for that procedure, in a list preceded by VAR. Any global variable that can be referenced in a
procedure must appear in a GLOBAL statement for that procedure, either preceded by VAR or
not. The resulting VCs are generally easier to prove if a global is preceded by VAR only when
this is required.

Any global variable that is referenced by a function must appear in the GLOBAL statement for
that function. Functions may not have VAR global variables.

Globals passed as parameters to another procedure are checked to be in the appropriate
GLOBAL list of the first procedure.

A variable  in  your  program  may  not  have  the  same  name  as  a function,  predicate,  or  record  field. 
However,  the  same  name  may  be  used  as  a record  field  in  two  different  types  of  records. 


1.10  Virtual  variables  and  Passive  statements 

The word VIRTUAL may precede the word VAR in a declaration or a procedure or function
parameter definition. In addition, the word VIRTUAL may precede a non-VAR parameter
definition. VIRTUAL entities may appear in documentation (ENTRY, EXIT, ASSERT,
COMMENT, ASSUME, PASSIVE) or they may be passed to other virtual entities. They may
not be used elsewhere.

The PASSIVE statement has been added to permit assignment to virtual variables. It is merely
an assignment statement preceded by the keyword PASSIVE. This is the only way in which a
virtual variable may be assigned to.


1.11 Operator precedence

The precedence for operators appearing in documentation and rules is different than in Pascal.
In particular, there are many more levels of precedence. The symbol "∨" is the lowest priority,
then "∧", and so on, in what seems to be a natural ordering (the specific ordering is contained in
the syntax charts). For this reason, the symbols used in documentation to represent the logical
operators are different than the AND, OR, and NOT of Pascal. For this purpose, documentation
is the formulas following INVARIANT, COMMENT, ASSERT, ASSUME, ENTRY, and
EXIT.

A limited form of type checking is performed in all documentation statements noted above. A
variable appearing within a statement must be declared and known; expressions must make sense
(thus addition cannot be performed on a Boolean variable, for example). However, there is no
requirement that function and predicate names be known. Parameters to these functions and
predicates are not checked. The exception to this is the PASSIVE statement, which must meet
the same (stricter) type-checking requirements as the assignment statement.

As  part  of  the  no  side-effect  enforcement,  the  verification  condition  generator  checks  to  ensure 
that  the  same  data  may  not  be  passed  to  a procedure  in  two  different  ways.  This  situation  is 
signalled  by  a syntax  error. 


1.12  Union  types 

The construct UNION has been added to replace variant records. UNION is a general type
constructor which can be combined with other types in the same way as ARRAY and RECORD.
There is a TAG function for determining the tag of a union variable, and there are selection and
construction functions.

The UNION type declaration has the form

TYPE untype = UNION a1: t1; ...; an: tn END;

where the ti are types and the ai are constants of an enumerated type or integer subrange. If the
ai are of an enumerated type, the type must have been declared previously, and each of its
elements must appear once in the UNION declaration.

Assuming that u and u1 are variables of a union type untype (above) and x is a variable of one
of the ti types, then the following operations are defined:

VAR u, u1: untype;
    x: ti;

SELECTION u:ai returns the ai component of u.

At any time, only one of the components of u exists. Selection of u:ai is an error if the tag of u is
not ai.

TAG function TAG(u) returns one of the constants ai, the current tag.

CONSTRUCTORS untype:ai(x) returns a value of untype with tag ai.

As a consequence of the declaration of untype, separate constructor functions are defined for each
of the ai. The constructor untype:ai takes values of type ti and converts them into values of the
union type.

ASSIGNMENTS

u := u1;
u:ai := x;            valid only if TAG(u)=ai
u := untype:ai(x);
u := x;               implicitly applies construction

Assignment to a union variable of a value of the same type is always permitted. An assignment
to a component of a union variable, as in the second statement, is permitted only if that
component currently exists in u. In the third statement, u is set to the union value constructed
from the value of x. The fourth statement is equivalent to the third one: the parser determines
from the mismatch between the types of u and x that the constructor untype:ai must be applied.
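
The tag discipline described above can be pictured with a small tagged-value model. This is a sketch under assumptions: the class Union and the helpers construct, tag_of, select and assign_component are hypothetical names chosen for illustration, and the checks happen at run time here, whereas the verifier reasons about them in verification conditions.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Union:
    """A value of a union type: exactly one component exists, named by its tag."""
    tag: str
    value: Any


def construct(tag, x):
    """Models the constructor untype:ai(x)."""
    return Union(tag, x)


def tag_of(u):
    """Models TAG(u)."""
    return u.tag


def select(u, tag):
    """Models the selection u:ai, an error unless TAG(u)=ai."""
    if u.tag != tag:
        raise ValueError(f"selection of component {tag} but tag is {u.tag}")
    return u.value


def assign_component(u, tag, x):
    """Models u:ai := x, permitted only if component ai currently exists in u."""
    if u.tag != tag:
        raise ValueError(f"assignment to component {tag} but tag is {u.tag}")
    return Union(tag, x)
```

For example, construct("N", 3) yields a value whose tag is "N"; selecting its "N" component returns 3, while selecting "D" raises an error.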


Example: The data structure and basic operations of LISP defined in Pascal with union types.

PASCAL

TYPE TAGS = (A, D, N);
     LISP = ↑U;
     DTPR = RECORD
              CAR: LISP;
              CDR: LISP
            END;
     ATOM = RECORD
              VALUE: LISP;
              PLIST: LISP
            END;
     U = UNION
           D: DTPR;
           A: ATOM;
           N: INTEGER
         END;

PROCEDURE CONS(X, Y: LISP; VAR RESULT: LISP);
GLOBAL (VAR #U);
EXIT TAG(RESULT↑)=D ∧ RESULT↑:D.CAR=X ∧ RESULT↑:D.CDR=Y;
VAR CELL: DTPR;
BEGIN
  NEW(RESULT);
  CELL.CAR:=X;
  CELL.CDR:=Y;
  RESULT↑:=U:D(CELL)
END;

FUNCTION CAR(X: LISP): LISP;
GLOBAL (#U);
ENTRY TAG(X↑)=D;
EXIT TRUE;
BEGIN
  CAR:=X↑:D.CAR
END;

PROCEDURE PLUS(X, Y: LISP; VAR RESULT: LISP);
GLOBAL (VAR #U);
ENTRY TAG(X↑)=N ∧ TAG(Y↑)=N;
EXIT TAG(RESULT↑)=N ∧ RESULT↑:N=X↑:N+Y↑:N;
BEGIN
  NEW(RESULT);
  RESULT↑:=X↑:N+Y↑:N;
  % note implicit application of U:N() to convert INTEGER to type U %
END;


A system command consists of a command keyword, possibly followed by some arguments, and a
terminating ";". The semicolon must always be present. Most command keywords can be
abbreviated to an initial substring that identifies the command unambiguously.
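
The abbreviation rule amounts to a unique-prefix lookup. The following is a minimal sketch, assuming that an exact keyword always wins over longer keywords sharing the same prefix; that tie-break, the function name resolve and the SPECIAL table are assumptions for illustration. The single entry in SPECIAL reflects the manual's statement that SIMPLIFY can be abbreviated to "S" even though other keywords begin with that letter.

```python
# Command keywords named in this chapter.
COMMANDS = [
    "READ", "READVC", "PRINTVC", "SIMPLIFY", "RESIMPLIFY",
    "DUMP", "DUMPVC", "DUMPRULE", "LOAD", "LOADVC", "LOADRULE",
    "DELRFILE", "DELRULE", "ALIAS", "SET", "RESET", "OPENFILE",
    "CLOSE", "HELP", "SHOW", "STATUS", "QUIT", "LISP",
]

# Documented special case; not a unique prefix.
SPECIAL = {"S": "SIMPLIFY"}


def resolve(abbrev, commands=COMMANDS):
    """Return the full keyword for an abbreviation, or raise ValueError."""
    word = abbrev.upper()
    if word in commands:                 # assumed: an exact keyword always wins
        return word
    if word in SPECIAL:
        return SPECIAL[word]
    matches = [c for c in commands if c.startswith(word)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown command: {abbrev}")
    raise ValueError(f"ambiguous abbreviation {abbrev}: {', '.join(matches)}")
```

With this table, "SIMPL" and "Q" resolve to SIMPLIFY and QUIT, while "RE" is rejected because READ, READVC, RESIMPLIFY and RESET all begin with it.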

There  are  four  classes  of  commands: 

(1)  Imperative  commands,  which  call  the  various  parts  of  the  verifier: 

(a) READ, READVC and PRINTVC for reading (parsing) and writing files
in user-readable format;

(b) SIMPLIFY and RESIMPLIFY for calling the theorem prover;

(c) DUMP commands and LOAD commands for writing and reading files in
internal format;

(d) DELRFILE, DELRULE for selective deletion of rulefiles and rules.

(2) commands that set system parameters: ALIAS, SET, RESET, OPENFILE,
CLOSE.

(3) commands for obtaining some sort of information from the system: HELP, SHOW,
STATUS.

(4) commands for system control: QUIT, LISP.

The following sections describe the command syntax informally; the formal syntax is given in
Appendix A.


2.1  Imperative  commands 

Most of the imperative commands take a file name as an (optional) argument. The syntax of file
names is exactly the same as at monitor level. Unless specified otherwise, the system will assume
unit DSK: and the current default PPN (see also the ALIAS command). Some commands will
assume default file names if parts of a file name are omitted. The defaults for file names are
explained in the description of the individual commands. In order to override a default extension
an empty extension can be forced by e.g., FOO.[X,BAZ].

READ  commands 

A READ command parses source code (rulefiles, programs, VCs), i.e., input in external format.
Input is read either from the keyboard or from a file. The system determines from the keyword
(the first word in the file) what kind of data it is reading. It announces what it is doing, and
gives the names of the VCs and rules. A READ command takes a file name as (optional)
argument. If no argument is given, reading is done from the keyboard. The command READ is
used for parsing Pascal source code and rulefiles. The command READVC is for reading in
VCs in external format. The command READ also knows about VC-files, i.e., the command
"READ FOO.VC;" is equivalent to "READVC FOO.VC;". Examples:

READ;              parses source code typed in from the terminal;
READ FOO.BAR;      parses the file FOO.BAR;
READVC FOO;        parses the file FOO.VC, assuming that it contains VCs;
READ FOO.VC;       does exactly the same.


A READ  command  will  not  accept  files  with  the  extensions  CRL,  CVC,  or  CTB.  Those  files 
have  to  be  read  into  the  verifier  using  a LOAD  command. 

PRINTVC

The command PRINTVC prints out VCs, either to the terminal or to a file (or both). It takes a
VC-specification and a file name as (optional) arguments. If no file name is given, printing is to
the terminal. The syntax of the arguments is the same as for SIMPLIFY (see below for
examples).


SIMPLIFY  commands 

The  command  SIMPLIFY  calls  the  theorem  prover.  The  prover  attempts  to  simplify  one  or  more 
VCs,  using  the  rules  that  are  currently  loaded.  The  command  takes  a VC-specification,  a file 
name  and  system  parameter  settings  as  (optional)  arguments.  If  no  VCs  are  specified,  all  current 
VCs  are  taken.  If  a file  name  is  given,  output  is  to  that  file;  a copy  can  also  be  displayed  on  the 
terminal.  If  no  file  name  is  given,  output  is  to  terminal  only.  A list  of  system  parameter  settings 
(in  parentheses)  may  appear  either  right  after  the  command  keyword  or  at  the  end  (before  the  ; ). 
The  command  can  be  abbreviated  to  "S".  Examples: 


SIMPLIFY;                    simplify all current VCs and display them on the
                             terminal;
S (TRACE, PROOFDEPTH=5);     simplify all VCs with TRACE turned on and
                             PROOFDEPTH set to 5;
SIMPL FOO 1 (SHOWGOAL);      simplify VC 1 of FOO and display subgoals
                             during the proof;
SIMPL TO FILE.EXT[A,FOO];    simplify current VCs and write simplified VCs
                             onto file FILE.EXT[A,FOO];
SIMPL → FILE.EXT[A,FOO];     same as previous example, "→" may be used
                             instead of "TO";
SIMPL MAIN COPY TO AAA;      simplify VCs of MAIN; write simplified VCs
                             onto file AAA and display on terminal.


The  RESIMPLIFY  command  takes  the  last  VC  returned  by  the  simplifier  and  has  another  go  at 
it.  Sometimes  this  will  have  a beneficial  effect. 


DUMP  commands 

The group of DUMP commands includes the commands DUMP, DUMPVC, and DUMPRULE.
A DUMP command produces a file containing VCs (DUMPVC), or rules (DUMPRULE), in
internal format so that at some later time they can be loaded directly using a LOAD command
without requiring parsing. All DUMP commands use default file names. If no file name is given
as argument, the default file name is VERIFY with an extension that depends on the command:
CVC for VCs, CRL for rulefiles. These standard extensions are always used when a file in
internal format is being created unless the user explicitly specifies a different (or empty) extension.
The short command DUMP dumps both VCs and rules to appropriately named files; the
argument to DUMP must be a simple file name without extension or PPN. It is advisable - and
convenient - to make use of the default extension feature as the LOAD commands also know
about them. Examples:


DUMPVC FOO;            write a file FOO.CVC containing current VCs;
DUMPVC FOO.BAR;        write a file FOO.BAR containing current VCs;
DUMPVC FOO.[P,PRO];    write a file FOO[P,PRO] containing current VCs;
DUMPRULE FOO;          write a file FOO.CRL containing current rules;
DUMP FOO;              write files FOO.CVC and FOO.CRL containing
                       current VCs and rules, respectively.
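
The defaulting of names and extensions for the DUMP commands can be summarised in a few lines. This is a sketch only: the function dump_targets is a hypothetical helper, PPNs in square brackets are deliberately left out of the model, and the treatment of a name containing a period as "extension given explicitly" is an assumption drawn from the examples.

```python
DEFAULT_NAME = "VERIFY"
EXTENSIONS = {
    "DUMPVC": ["CVC"],
    "DUMPRULE": ["CRL"],
    "DUMP": ["CVC", "CRL"],
}


def dump_targets(command, arg=None):
    """Return the file names a DUMP command would write (PPNs not modelled)."""
    name = arg if arg else DEFAULT_NAME
    if "[" in name:
        raise ValueError("PPNs are not modelled in this sketch")
    if command == "DUMP":
        if "." in name:
            raise ValueError("DUMP takes a simple file name without extension")
        return [f"{name}.{ext}" for ext in EXTENSIONS[command]]
    if "." in name:
        return [name]            # explicit (possibly empty) extension is kept
    return [f"{name}.{EXTENSIONS[command][0]}"]
```

Under these assumptions dump_targets("DUMPVC") gives VERIFY.CVC and dump_targets("DUMP", "FOO") gives FOO.CVC and FOO.CRL, matching the examples.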


If more than one rulefile exists, DUMPRULE will dump the one which was most recently parsed.
A particular rulefile may be specified for dumping by giving its name as a second (optional)
argument. Example:


DUMPRULE FILE, SRULES;    dump rulefile SRULES onto file FILE.


LOAD  commands 

The group of LOAD commands includes the commands LOAD, LOADVC, and LOADRULE.
A LOAD command reads in a file which was previously created by a DUMP command. LOAD
commands use the same conventions for naming files as the DUMP commands. If no file name is
specified for LOAD, or if no extension is specified, all loadable files with the default name
(VERIFY) and default extensions (CVC, CRL) will be used. The "long" commands load a file
with an extension corresponding to their suffix. Examples:

LOADVC FOO;        loads the file FOO.CVC;
LOAD FOO.CVC;      does exactly the same;
LOADVC;            loads the file VERIFY.CVC;
LOAD FOO;          loads whichever (or all) of the files
                   FOO.CVC and FOO.CRL exist.


DELETE  commands 

Rulefiles  and  rules  can  be  deleted  selectively  by  the  commands 

DELRFILE <list of rulefiles>;    for rulefiles, and

DELRULE <list of rule names>;    for rules.

The command DELRFILE without argument deletes all rules (so beware!).


2.2 Setting system parameters

ALIAS

The  ALIAS  command,  like  the  monitor  command,  changes  the  project/programmer  name  (PPN) 
the  verifier  uses  as  default.  It  affects  all  file  input  and  output  to  and  from  the  verifier. 
Examples: 

ALIAS VER,FOO;    changes the default PPN to [VER,FOO];
ALIAS;            prints out the current default PPN.

SET, RESET

The user can set the values of various parameters that control the system. Parameter values can
be changed either for the rest of the session with the SET/RESET commands ("sticky" changes),
or temporarily in imperative commands. SET accepts as argument a list of parameters and
values; if no parameter value is given to SET, it uses T (for TRUE). RESET sets parameters to
their default values; this command accepts only a list of parameter names as argument. Examples:

SET TRACE, PROOFDEPTH=5;

RESET TRACE;

Sequences of SET command operands in parentheses may be included in a command string either
after the keyword or at the end preceding the ";" (for examples see the explanation of SIMPLIFY).
The difference between setting parameters this way, or using SET/RESET, is that SET and
RESET settings are permanent; settings given in a command string apply only for the duration of
the command execution. If the same parameter name occurs twice, the first setting is overwritten.
The type of parameter value expected depends on the parameter name. The following list gives
user adjustable parameters with the type of their values:


natnum: an integer greater than 1
bool:   a LISP flag: T or NIL

ASSERTDEPTH   natnum   maximum forward assertion depth;
CASEDEPTH     natnum   maximum depth of nesting of forward cases;
DEPTHTALK     bool     signal whenever a depth bound is reached;
PROOFDEPTH    natnum   maximum backward proof depth;
RULE          bool     enable rulehandler;
SHOWFACT      bool     enable assertion display during proof;
SHOWGOAL      bool     enable subgoal display during proof;
SHOWTEST      bool     enable display of tests made during proof;
SUMMATCH      bool     enable special sum matching: extra subspace instance;
TERMINAL      bool     if set, file output (from SIMPLIFY and PRINTVC)
                       is also displayed on the terminal;
TRACE         bool     enable proof tracing;
TRACEFACT     bool     enable display of assertions made in trace output;
TRACEVC       bool     enable display of intermediate VCs during proof
                       (works only if TRACE is set);

Current default values can be found using the SHOW command.
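
The difference between sticky settings (SET/RESET) and settings that apply to a single command can be modelled as two layers of values. The sketch below is illustrative only: the class Parameters, its method names and the default values used in the example are hypothetical, since the manual does not list the defaults.

```python
class Parameters:
    """Sticky parameter values plus per-command temporary overrides."""

    def __init__(self, defaults):
        self._defaults = dict(defaults)
        self._current = dict(defaults)

    def _check(self, name):
        if name not in self._defaults:
            raise KeyError(f"unknown parameter: {name}")

    def set(self, settings):
        """SET: sticky change; a value of None stands for an omitted value (T)."""
        for name, value in settings:
            self._check(name)
            self._current[name] = True if value is None else value

    def reset(self, names):
        """RESET: restore the default values of the named parameters."""
        for name in names:
            self._check(name)
            self._current[name] = self._defaults[name]

    def for_command(self, settings=()):
        """Values in effect for one command; a repeated name overwrites the earlier one."""
        effective = dict(self._current)
        for name, value in settings:
            self._check(name)
            effective[name] = True if value is None else value
        return effective


# Hypothetical defaults, chosen only for the example.
params = Parameters({"TRACE": False, "PROOFDEPTH": 3})
params.set([("TRACE", None), ("PROOFDEPTH", 5)])       # SET TRACE, PROOFDEPTH=5;
temporary = params.for_command([("PROOFDEPTH", 7)])    # S (PROOFDEPTH=7);
```

After the temporary command the sticky value of PROOFDEPTH is still 5, and a later RESET of TRACE returns it to its default.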

OPENFILE 

The command OPENFILE opens a backup file. A backup file gets a copy of all the output from
the system that is displayed on the terminal. It takes a file name as (optional) argument; the
default file name is VERIFY.BKP. If the parameter (NOT) is included after the file name,
output goes to the file only. A backup file can be closed using the command CLOSE. Example:

OPENFILE FOO (NOT);

Note that output cannot go to two different files simultaneously. Thus the backup file has to be
closed before PRINTVC or SIMPLIFY can write onto other files, or another backup file can be
opened. The system will notify you if this is necessary.

CLOSE 

The  command  CLOSE  closes  a backup  file.  The  command  takes  no  argument. 


2.3  Query  commands 
HELP 

The HELP command provides information about various system features. It takes a keyword as
argument. "HELP;" gives some general information about the verifier and pointers to further
information. "HELP WHAT;" gives the list of topics for which help is available.

SHOW 

The  SHOW  command  displays  the  current  values  of  system  parameters.  It  takes  a list  of 
parameter  names  (separated  by  commas)  as  argument.  If  no  arguments  are  given,  SHOW 
displays  the  values  of  all  parameters  the  system  knows  about. 

STATUS 

The STATUS command prints out a list of names of VCs, rulefiles and rules currently loaded. It
takes no arguments.


2.4  System  control 
QUIT 

The  QUIT  command  is  provided  to  allow  one  to  exit  gracefully  from  the  verifier.  "QUIT;"  will 
return  you  to  the  monitor. 

LISP

Typing "LISP;" to the system gets the user to the Maclisp toplevel. This command exists
primarily for system maintenance and test; the uninitiated user should never need to use it. Once
at LISP toplevel, evaluating (RESUME) will return control to the verifier command level.


Running the system

When the system is loaded, it prints out some more or less useful messages. As soon as the
prompt character ">" appears, the system is ready to accept commands. The system tries to be
fairly talkative; when executing a command it always prints out something. Thus, if the prompt
character appears the system expects more input before it can execute the command (for example,
the terminating ";" may have been omitted). All file manipulation is announced to the user,
including full file names.

Error  recovery 

If for some reason or other the system ends up with a LISP error, evaluating (RECOVER) will
return control to verifier command level. Typing <control> P will do exactly the same. If the
error occurred in the simplifier, it will be reinitialized automatically.


3. DESCRIPTION OF THE SIMPLIFIER


3.1  Introduction 

The prover has two components, a simplifier and a rulehandler (which is described in Part II,
Chapter 4). The simplifier finds a normal form for any expression over the language consisting
of individual variables, the usual boolean connectives, equality, the numerals, the arithmetic
functions and predicates +, -, ≤, and <, the LISP constant and functions NIL, CAR, CDR and
CONS, the functions ARRAYSTORE and ARRAYSELECT for storing into and selecting from
arrays, the functions RECORDSTORE and RECORDSELECT for storing into and selecting
from records, and uninterpreted function symbols. Individual variables range over the union of
the reals, the set of arrays, the set of records, LISP list structure and the booleans TRUE and
FALSE.

The simplifier is complete; that is, it simplifies every valid formula to TRUE. Thus it is also a
decision procedure for the quantifier-free theory of reals, arrays, records, and list structure under
the above functions and predicates.

The  following  are  some  examples  of  simplifications: 

2 + 3*5
17

P ⊃ ¬P
¬P

x = f(x) ⊃ f(f(x)) = f(f(f(x)))
TRUE

x≤y ∧ y+d≤x ∧ 3*d≥2*d ⊃ v[2*x - y] = v[x + d]
TRUE

The  simplifier  includes  a number  of  cooperating  special  purpose  provers,  each  a decision 
procedure  for  a particular  quantifier-free  theory.  For  instance,  there  is  one  prover  for  arithmetic, 
one  for  arrays,  etc.  Each  prover  has  some  modifications  for  use  in  the  verifier;  some  of  the 
modifications  are  temporary  and  reflect  only  the  present  version. 


3.2  Prover  for  arithmetic 
The  axioms  of  this  theory  are: 
x + 0 = x
x + -x = 0
(x + y) + z = x + (y + z)
x + y = y + x
x ≤ x
x ≤ y ∨ y ≤ x
x ≤ y ∧ y ≤ x ⊃ x = y
x ≤ y ∧ y ≤ z ⊃ x ≤ z
x ≤ y ⊃ x + z ≤ y + z
0 ≠ 1
0 ≤ 1

The numerals 2, 3, . . . and ≥ are defined in terms of 0, 1, +, - and ≤ in the usual way. We also
allow multiplication by integer constants; for instance, 2 * x abbreviates x + x.

The integers, rationals and reals are all models for these axioms. Any formula which is
unsatisfiable over the rationals or reals can be shown unsatisfiable as a consequence of these
axioms. Thus our simplifier is complete for the rationals or reals. It is not complete if the
variables range over the integers, since there are unsatisfiable formulas, such as x + x = 5, which
cannot be shown unsatisfiable as a consequence of the above axioms. The reason for the
incompleteness is that determining the unsatisfiability of a conjunction of integer linear
inequalities (the integer linear programming problem) is much harder in practice than
determining the satisfiability of a conjunction of rational linear inequalities. This incompleteness
is not as bad as it seems, since most formulas that arise in program verification do not depend on
subtle properties of the integers.
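
The example x + x = 5 can be checked directly. The following is a minimal Python sketch, not part of the verifier; the function names and the search bound are assumptions made for illustration. It exhibits the rational witness and finds no integer solution in a bounded range (no integer solution exists at all, since x + x is always even).

```python
from fractions import Fraction

def rational_solution(total):
    """Solve x + x = total over the rationals (always solvable)."""
    return Fraction(total, 2)

def integer_solutions(total, bound=100):
    """Search the integers in [-bound, bound] for solutions of x + x = total."""
    return [x for x in range(-bound, bound + 1) if x + x == total]

x = rational_solution(5)
print(x, x + x == 5)         # the rational witness satisfies the equation
print(integer_solutions(5))  # integer candidates found in the searched range
```

A formula such as x + x = 5 is therefore satisfiable in one model of the axioms (the rationals) and cannot be refuted from the axioms alone.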


In the present version, we have implemented one useful heuristic which makes the simplifier no
longer sound for reals or rationals but which catches much of the incompleteness concerning
integers. In addition to ≤, we allow < as a predicate symbol, but define x < y to be x + 1 ≤ y.

Notice that in the description of the simplifier multiplication is NOT mentioned, although it
appears in the examples. At the moment, we allow expressions such as 2 * x and there is some ad
hoc code which tries to capture the more obvious properties of multiplication by constants, but the
code makes no pretence of being complete. (The quantifier-free theory of integers under addition
and multiplication is undecidable.)


3.3  Record  prover 

The record prover handles expressions involving storing into and selecting from records and
record fields. The following axioms are implemented:

<r, .f, r.f> = r

<<r, .f, e1>, .f, e2> = <r, .f, e2>

<r, .f, e>.f = e

<r, .f, e>.g = r.g

With one exception, these are the axioms of the quantifier-free theory of records. The one axiom
that is not implemented in the record prover concerns permutation of terms within data triples;
that is, the axiom <<r, .f, e1>, .g, e2> = <<r, .g, e2>, .f, e1>. The reason for this omission is that
this axiom can lead to combinatorial explosion. It appears to be rarely necessary in proofs and
can be included as a rule if necessary.
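
As an informal check of these axioms, one can test them on a concrete model. The sketch below is an assumption-laden illustration in Python (records modelled as dictionaries, with `store` and `select` as hypothetical names standing for RECORDSTORE and RECORDSELECT); it is not the record prover itself, which works on symbolic terms.

```python
def store(r, f, e):
    """<r, .f, e>: a copy of record r whose field f holds e."""
    updated = dict(r)
    updated[f] = e
    return updated

def select(r, f):
    """r.f: the value of field f in record r."""
    return r[f]

r = {"f": 1, "g": 2}

# <r, .f, r.f> = r
assert store(r, "f", select(r, "f")) == r
# <<r, .f, e1>, .f, e2> = <r, .f, e2>
assert store(store(r, "f", 10), "f", 20) == store(r, "f", 20)
# <r, .f, e>.f = e
assert select(store(r, "f", 10), "f") == 10
# <r, .f, e>.g = r.g
assert select(store(r, "f", 10), "g") == select(r, "g")
# The permutation axiom, which the record prover omits, also holds in the model.
assert store(store(r, "f", 10), "g", 20) == store(store(r, "g", 20), "f", 10)
```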

The record prover can be turned on and off from LISP. Evaluating (RECORDPROVER) turns
it on (and is the default); (NORECORDPROVER) turns it off.


3.4  Array  prover 

The array prover implements the following axioms for arrays:

<a, [i], a[i]> = a

<<a, [i], e1>, [i], e2> = <a, [i], e2>

<a, [i], e>[j] = (if i = j then e else a[j])

Again the axiom for permutations within data triples is missing. There are at the moment
problems with the array prover in the verifier; because of an interface problem with the
rulehandler, it is running much too slowly and requiring too much workspace. For this reason the
array prover in the simplifier is temporarily off by default. It can be turned on in two ways
from LISP. Evaluating (FASTARRAYPROVER) turns on a version which implements the first
two axioms above plus the axiom <a, [i], e>[i] = e; it therefore lacks the axiom i ≠ j ⊃ <a, [i], e>[j]
= a[j]. Evaluating (SLOWARRAYPROVER) turns on a version which implements the three
axioms above. (NOARRAYPROVER) turns the array prover off and is the default.
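
The way the three array axioms act as simplification steps can be sketched on symbolic terms. The Python below is an illustration only: the tuple encoding of terms and the names `store` and `select` are assumptions, and indices are compared syntactically, whereas the actual prover reasons about equalities between indices.

```python
def select(a, j):
    """Simplify a[j] using <a, [i], e>[j] = (if i = j then e else a[j])."""
    if isinstance(a, tuple) and a[0] == "store":
        _, base, i, e = a
        if i == j:                       # indices syntactically identical
            return e
        if isinstance(i, int) and isinstance(j, int):
            return select(base, j)       # distinct integer constants
        return ("if", ("=", i, j), e, select(base, j))
    return ("select", a, j)

def store(a, i, e):
    """Build <a, [i], e>, applying the first two axioms where they match."""
    if e == ("select", a, i):            # <a, [i], a[i]> = a
        return a
    if isinstance(a, tuple) and a[0] == "store" and a[2] == i:
        return ("store", a[1], i, e)     # <<a, [i], e1>, [i], e2> = <a, [i], e2>
    return ("store", a, i, e)

print(select(store("a", "i", "e"), "i"))
print(select(store("a", "i", "e"), "j"))
```

When the indices cannot be decided, the sketch leaves a conditional term behind, which corresponds to the case analysis expressed by the third axiom.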


3.5  List  structure  prover 

Since  Pascal  does  not  have  LISP  list  structure,  the  LISP  special  purpose  prover  has  thus  far  not 
been  turned  on  in  the  Pascal  verifier. 

3.6  Remarks 

Complete descriptions of the various parts of the simplifier and the component special provers
appear in [22, 23, 24].
4. THE RULE LANGUAGE


4.1  Introduction  to  rules 

We  give  an  informal  description  of  the  rule  language.  A precise  description  of  the  syntax  is 
given  in  Appendix  A:  this  section  is  intended  as  a brief  introduction  to  rules. 

There  are  two  types  of  rules:  forward  rules  and  backward  rules.  Roughly  speaking,  forward  rules 
add  new  facts  to  the  data  base  of  the  theorem  prover  as  a consequence  of  old  facts.  Backward 
rules  specify  sets  of  subgoals  which  may  be  used  in  proving  goals  set  up  by  the  theorem  prover. 
Some  rules  may  cause  "case  splitting,"  which  is  the  separation  of  a proof  search  into  multiple 
contexts  for  the  purpose  of  considering  cases. 

For  each  kind  of  rule,  we  give  a brief  description  of  the  syntax,  logical  meaning,  and  semantics. 
The  logical  meaning  specified  is  the  "strongest"  logical  fact  expressed  by  the  rule.  The  semantics 
describe  how  this  fact  will  be  used  by  the  theorem  prover  in  proofs. 

Certain  conventions  are  used  in  the  description  below.  Brackets  in  a syntactic  description  indicate 
an  optional  expression.  A LITERAL  is  an  atomic  formula  or  a negated  atomic  formula.  A 
TRIGGER-EXPRESSION  is  an  expression  which  contains  no  propositional  operators  and  which 
is  not  an  individual  variable.  A REPLACEMENT-EXPRESSION  is  an  expression  which 
contains  no  propositional  operators.  An  expression  is  an  expression  in  the  assertion  language. 
4.1.1 Backward rules
SYNTAX

infer A [from B] [whenever TR-1, TR-2, . . .]

A: (the INFER clause) a conjunction of literals.

B: (the FROM clause) a conjunction of literals.

TR-i: a trigger-expression.

The WHENEVER and FROM clauses are optional. If there is no FROM clause, B defaults
to TRUE. If there is a WHENEVER clause, it must have at least one trigger.

LOGICAL MEANING

B ⊃ A

SEMANTICS 

A backward  rule  "applies"  when  the  prover  is  trying  to  prove  any  of  the  literals  in  the  INFER 
clause.  If  the  FROM  clause  can  be  proved,  the  INFER  clause  is  assumed  to  be  proved.  Multiple 
rules  interact  through  standard  subgoaling  techniques.  If  A is  the  propositional  constant  FALSE, 
a contradiction  will  be  derived  if  the  FROM  clause  can  be  proved.  Triggers  in  the 
WHENEVER  clause  restrict  situations  in  which  the  rule  will  be  applied  to  those  in  which 
instances  of  each  trigger  have  occurred  as  subterms.  Proof  of  the  literals  in  the  FROM  clause 
proceeds  from  left  to  right. 

EXAMPLES 

infer A div B ≤ N from A<N ∧ B≥1

infer Ordered(a,i,j) from Ordered(a,i,k) ∧ Ordered(a,k,j)

infer ISDERIV(X,MAKE_SEQUENCE(X))

infer ISDERIV(X,CONCAT(Z,CONCAT(R,T))) from
      ISDERIV(X,CONCAT(APPEND(Z,L),T)) ∧ ISPROD(L,R)

% Two rules for verifying a context free parser;
  ISDERIV(X,Y) means there is a derivation from X to Y;
  ISPROD(L,R) means there is a production from L to R. %
4.1.2 Forward rules: FROM rules
SYNTAX

[whenever TR-1, TR-2, . . .] from B infer A

[whenever TR-1, TR-2, . . .] from B infer cases CASE-1 ; CASE-2 ; . . . end

A: (the INFER clause) a conjunction of literals.

B: (the FROM clause) a conjunction of literals.

TR-i: a trigger-expression.

CASE-i: a CASE (see below).

If there is a WHENEVER clause, it must have at least one trigger. A CASE must be one of the
following two forms:

C → D    or    C

where C and D are conjunctions of literals. In the second case, D defaults to TRUE.
There must be at least one case in a CASES clause.

LOGICAL MEANING

B ⊃ A

B ⊃ [(C-1 ∧ D-1) ∨ (C-2 ∧ D-2) ∨ . . .]

SEMANTICS 

When all of the WHENEVER triggers have been instantiated (there may be none), and when all
of the literals in the FROM clause have become true, the INFER clause is asserted. If the INFER
clause is a conjunction of literals, they are all asserted. If the INFER clause is a CASES construct,
a case split is required. The actual split is delayed as long as possible (since a split is potentially
expensive) but is done before any backward rules are applied. A case of the split may be
eliminated during proof (but before the split is actually done) when its C-i formula (the formula
to the left of the arrow) becomes false. If all but one of the cases are eliminated, no split is done;
instead the remaining case is asserted immediately.
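
The case elimination described above can be sketched as follows. This is a simplified Python illustration under stated assumptions: guards and consequences are opaque strings, the set `known_false` stands for the guards the data base has already refuted, and the function name `reduce_cases` is hypothetical.

```python
def reduce_cases(cases, known_false):
    """Eliminate cases whose guard C-i is already known to be false.

    cases: list of (guard, consequence) pairs, one per CASE "C -> D".
    known_false: set of guards currently false in the data base.
    """
    remaining = [case for case in cases if case[0] not in known_false]
    if len(remaining) == 1:
        guard, consequence = remaining[0]
        return ("assert", [guard, consequence])   # no split needed
    return ("split", remaining)                    # split still pending

cases = [("C-1", "D-1"), ("C-2", "D-2"), ("C-3", "D-3")]
print(reduce_cases(cases, {"C-1"}))
print(reduce_cases(cases, {"C-1", "C-3"}))
```

In the second call only one case survives, so its guard and consequence are asserted directly and the potentially expensive split is avoided.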

EXAMPLES 

from P(S) infer ¬Q(X,Y,S)

% When P(S) is true, Q(X,Y,S) is false, for all X and Y %

whenever A*B from A≥0 ∧ B≥0 infer A*B≥0

whenever X/Y from Y≠0 infer X=Y*(X/Y)

whenever MIN(I,J,K) from TRUE infer
      cases I≤J∧I≤K → I; J≤I∧J≤K → J; K≤I∧K≤J → K end

% MIN(I,J,K)=I if I≤J and I≤K . . . %
4.1.3 Forward rules: REPLACE rules
SYNTAX

replace TR [where A] by RP

replace TR [where A] by cases C-1 → RP-1 ; C-2 → RP-2 ; . . . end

A: (the WHERE clause) a conjunction of literals.

C-i: a conjunction of literals.

TR: a trigger.

RP: a replacement.

RP-i: a replacement.

The WHERE clause is optional. If there is no WHERE clause, A defaults to TRUE. If there
is a CASES clause, it must have at least one case.

LOGICAL MEANING

A ⊃ (TR = RP)

A ⊃ [(C-1 ∧ TR = RP-1) ∨ (C-2 ∧ TR = RP-2) ∨ . . .]

SEMANTICS 

When an instance of TR appears in the data base and the WHERE clause has become true, the
action specified by the BY clause is carried out. If the BY clause is a replacement, then an
equality (or equivalence) between TR and RP is asserted. If the BY clause is a CASES clause, a
split is propagated. The two rules given in the syntax specification are equivalent to the
following two FROM rules:

whenever TR from A infer TR = RP

whenever TR from A infer cases C-1 → TR = RP-1 ; C-2 → TR = RP-2 ; . . . end

EXAMPLES 

replace X DIV 1 by X
% Division by 1 %

replace A*B by B*A
% Commutativity of multiplication. This rule will not loop. %

replace <A,[I],E>[J] by cases I=J → E; I≠J → A[J] end
% Array data structure term simplification %

replace SIGN(X) by cases X≥0 → 1; X<0 → -1 end
% Will cause a split if neither X≥0 nor X<0 can be shown %
4.1.4 Rulefiles

SYNTAX

RULEFILE(name)

[constant CS-1, CS-2, . . . ;]

[pattern PT-1, PT-2, . . . ;]

RN-1: RULE-1; RN-2: RULE-2; . . .

RULE-i: a rule.

RN-i: an identifier which will name the rule.

CS-i: an identifier which is to be a pattern constant.

PT-i: an identifier which is to be a pattern variable.

The CONSTANT and PATTERN specifications are optional. All identifiers appearing in the
rulefile are assumed to be pattern variables except those used as function or predicate names, or
as record field identifiers. These defaults can be overridden using the CONSTANT and
PATTERN declarations.

SEMANTICS 

A rulefile is a collection of rules. More than one rulefile can be active in the theorem prover at
once. Each rule and each rulefile must have a unique name. Thus rules or rulefiles can be
replaced by reading new rules or rulefiles with identical names; old rules or rulefiles with the same
name are deleted. The order in which rules appear in the file is, more or less, the order in which
they will be applied by the theorem prover.

EXAMPLE 

rulefile(sample)

constant NULL, EMPTY, CONST1, CONST2;
% Declare various identifiers to be pattern constants %

CONST: from TRUE infer CONST1≠CONST2;
% Assert that CONST1≠CONST2 to the data base %

INEQ: infer X>0 from X≥0 ∧ X≠0;
% Rules like this may be required sometimes %

APNULL: replace APPEND(NULL,X) by X;
% NULL is a constant %

GINFO: replace G(X) where X.INFO=EMPTY by NULL;
% INFO is a record field identifier,
  and therefore not a pattern variable %
4.1.5 Switches and parameters

There are various parameters and switches for controlling the proof search and tracing. Several
of these are depth bounds which allow the user to constrain the search in various ways. The
SHOW switches are particularly useful for debugging rulefiles. In the default case, all trace
switches are off. The SHOW command (see Section 2.3) can be used to determine the default
settings of the depth bounds.

DEPTHTALK (switch) -- If this switch is set to true, the prover will print a message whenever it
reaches a depth bound during search.

PROOFDEPTH (integer) -- This value is approximately the maximum depth of nesting of
backward rules.

ASSERTDEPTH (integer) -- This value is approximately the maximum depth of nesting of
assertions made by forward rules.

CASEDEPTH (integer) -- Approximately, the maximum number of forward case splits which
will be allowed. All others will be ignored. This value does not include splits which are
eliminated due to case reduction.

TRACE (switch) -- If this switch is set, a proof summary will be printed after simplification of a
verification condition.

TRACEFACT (switch) -- If TRACE and TRACEFACT are both set, the proof summary will
include a listing of facts asserted by forward rules.

TRACEVC (switch) -- An intermediate version ("presimplified") of the theorem to be proved will
be printed. This version is the result of simplifying the formula in the presence of no rules. This
output is useful for interpreting the TRACE results.

SUMMATCH (switch) -- If this switch is set, additional specific instances will be generated
during matching of sums. The use of this switch is described in Section 4.2.13.

SHOWFACT (switch) -- This switch will cause the prover to display facts asserted by forward
rules during simplification. Some of these facts may be asserted in inconsistent contexts, and may
be false.

SHOWGOAL (switch) -- The theorem prover will display subgoals (from backward rules)
generated during a proof if this switch is set. This feature is useful during development of
rulefiles and assertions in programs. Some successful subgoals will not be displayed, because they
are proved by TEST (see Section 4.2.7).

SHOWTEST (switch) -- The theorem prover will show all instantiated literals which are
TESTed during proof if this switch is set (see Section 4.2.7).
4.1.6 An example

Here is a sample rulefile and two proofs which make use of it. The rulefile is not particularly
efficient, though it does demonstrate several features of the rule language. The verification
conditions come from an insertion sort program. The TRACE, TRACEFACT, and TRACEVC
switches have been set. It took three seconds to prove INSERTSORT 2, and seven seconds to
prove INSERTSORT 3. If only the rules ORD3 and ORD9 are included, proof of
INSERTSORT 3 takes only two seconds.


RULEFILE(INSERT)

% Rulefile for insertion sort %

PERM1: INFER Permutation(I,I);
PERM2: INFER Permutation(Exchange(A,I,J),B) FROM Permutation(A,B);
PERM3: REPLACE <<P1,[P2],P1[P3]>,[P3],P4> BY Exchange(<P1,[P2],P4>,P2,P3);

DATA1: REPLACE <A,[J],X>[K] WHERE K=J BY X;
DATA2: REPLACE <A,[J],A[J]> BY A;

ORD1: INFER Ordered(K,I,J) FROM I≥J;
ORD2: INFER Ordered(K,I,J) FROM Ordered(K,I,L) ∧ Ordered(K,L,J);
ORD3: INFER Ordered(K,I,J) FROM Ordered(K,L,M) ∧ L≤I ∧ J≤M;
ORD4: INFER Ordered(<K,[J],E>,I,L) FROM I=J ∧ E≤K[I+1] ∧ Ordered(K,I+1,L);
ORD5: INFER Ordered(<A,[I],A[I-1]>,1,J) FROM 1<I ∧ I<J ∧ Ordered(A,1,J);
ORD6: INFER Ordered(<A,[I],A[I-1]>,1,J) FROM I=J ∧ Ordered(A,1,J-1);
ORD7: INFER Ordered(<K,[J],E>,I,L) FROM J=L ∧ K[L-1]≤E ∧ Ordered(K,I,L-1);
ORD8: INFER Ordered(A,1,I) FROM Ordered(A,1,I-1) ∧ A[I]≥A[I-1];
ORD9: INFER Ordered(<K,[I],E>,J,L) FROM J<I ∧ I<L ∧ Ordered(K,J,I-1) ∧
            Ordered(K,I+1,L) ∧ K[I-1]≤E ∧ E≤K[I+1];

ARR:  INFER K[L]≤K[M] FROM Ordered(K,I,J) ∧ I≤L ∧ L<M ∧ M≤J;
Unsimplified Verification Condition: INSERTSORT 2

(ORDERED(K,1,J) ∧
 J≤N ∧
 X<K[I+1] ∧
 0<I ∧
 I<J ∧
 PERMUTATION(<K,[I],X>,K0) ∧
 0<I-1
 ⊃
 (¬(K[I-1]≤X) ∧
  K_2=<K,[I-1+1],K[I-1]>
  ⊃
  ORDERED(K_2,1,J) ∧
  J≤N ∧
  X<K_2[I-1+1] ∧
  0<I-1 ∧
  I-1<J ∧
  PERMUTATION(<K_2,[I-1],X>,K0)))


Presimplified Verification Condition: INSERTSORT 2

(ORDERED(K,1,J) ∧
 J≤N ∧
 X<K[1+I] ∧
 0<I ∧
 I<J ∧
 PERMUTATION(<K,[I],X>,K0) ∧
 2≤I ∧
 K[I-1]>X ∧
 K_2=<K,[I],K[I-1]>
 ⊃
 ORDERED(K_2,1,J) ∧
 X<K_2[I] ∧
 PERMUTATION(<K_2,[I-1],X>,K0))


Simplified  Verification  Condition:  INSERTSORT  2 
TRUE 




Proof summary for INSERTSORT 2.

Assertions made by rules:

DATA1: K_2[I]=K[I-1]

PERM3: <K_2,[I-1],X>=EXCHANGE(<K,[I],X>,I,I-1)

Top level goal ORDERED(K_2,1,J)

Proof from backwards rule ORD5.

Subgoal X<K_2[I]

Proved without backwards rules.

Top level goal PERMUTATION(<K_2,[I-1],X>,K0)

Proof from backwards rule PERM2.


End of proof summary for INSERTSORT 2.


Unsimplified Verification Condition: INSERTSORT 3

(ORDERED(K,1,J) ∧
 J≤N ∧
 X<K[I+1] ∧
 0<I ∧
 I<J ∧
 PERMUTATION(<K,[I],X>,K0) ∧
 0<I-1
 ⊃
 (K[I-1]≤X
  ⊃
  (K_1=<K,[I-1+1],X>
   ⊃
   ORDERED(K_1,1,(J+1)-1) ∧
   J+1≤N+1 ∧
   2≤J+1 ∧
   PERMUTATION(K_1,K0))))


Presimplified Verification Condition: INSERTSORT 3

(ORDERED(K,1,J) ∧
 J≤N ∧
 X<K[1+I] ∧
 0<I ∧
 I<J ∧
 PERMUTATION(<K,[I],X>,K0) ∧
 2≤I ∧
 K[I-1]≤X ∧
 K_1=<K,[I],X>
 ⊃
 ORDERED(K_1,1,J))


Simplified  Verification  Condition:  INSERTSORT  3 
TRUE 


Proof summary for INSERTSORT 3.

Top level goal ORDERED(K_1,1,J)

Proof from backwards rule ORD9.
Subproofs:

Subgoal ORDERED(K,I+1,J)

Proof from backwards rule ORD3.

Subgoal ORDERED(K,1,I-1)

Proof from backwards rule ORD3.


End of proof summary for INSERTSORT 3.


4.2  Using  the  rule  language 

In this section we describe techniques for writing rules. The primary purpose of the rule
language  is  to  allow  users  of  the  verifier  to  supply  lemmas  to  the  theorem  prover.  By  providing 
the  necessary  rules,  the  user  can  effectively  extend  the  assertion  language  to  include  new  concepts. 

For example, let Ordered(A,i,j) mean that the array A is ordered in the interval [i,j]. By giving
suitable rules, Ordered can be used in assertions in programs, and the theorem prover can be
expected to prove a large variety of valid verification conditions involving Ordered.

Suppose we wish to express the following fact about Ordered:

(*) (∀A,i,j) (Ordered(A,i+1,j) ∧ i<j ∧ A[i]≤A[i+1] ⊃ Ordered(A,i,j)).


That is, if the array A is ordered in [i+1,j], and A[i] is not greater than the smallest element of A
in the interval (namely, A[i+1]), then A is ordered in [i,j].

It  would  be  nice  if  we  only  had  to  provide  logical  statements  like  (*),  and  proofs  of  valid 
verification  conditions  were  forthcoming.  However,  the  theorem  prover  does  not  have  much 
heuristic knowledge, and uses only the simplest methods to search for proofs. Therefore, when we
provide  a logical  statement  to  the  prover,  we  must  tell  it  how  to  make  use  of  that  fact. 

We  start  by  distinguishing  the  two  main  types  of  rules.  Then,  a short  description  of  the  theorem 
proving  algorithm  is  given.  This  provides  the  background  for  a more  complete  discussion  of  the 
differences  between  the  two  types  of  rules.  Following  this,  some  details  are  given  about  the 
ordering  of  proof  search,  to  help  the  user  improve  the  efficiency  of  his  rules. 

Pattern  matching  is  then  discussed.  The  matcher  used  in  the  rulehandler  makes  use  of  semantic 
knowledge  in  certain  domains. 

Several  sections  follow  which  describe  various  specific  features  of  the  rule  language.  Included 
among  these  features  are  rule  schemata,  a device  for  controlling  application  of  rules  through  the 
use  of  the  matcher,  case  splitting,  for  doing  proof  by  considering  cases,  and  semantic  matching. 

Finally,  some  general  advice  is  given  on  efficiency  considerations. 


4.2.1  Forward  and  backward  rules 

Here  is  one  way  (*)  can  be  expressed  to  the  theorem  prover: 

R1: INFER Ordered(A,i,j) FROM i<j ∧ Ordered(A,i+1,j) ∧ A[i]≤A[i+1];

This has the effect of saying: "If you are trying to prove that A is ordered in [i,j], for any A, i,
and j, then first prove that i<j, then Ordered(A,i+1,j), and finally, A[i]≤A[i+1]."

Here is another way of expressing (*):

R2: FROM Ordered(A,i+1,j) ∧ i<j ∧ A[i]≤A[i+1] INFER Ordered(A,i,j);

That is, for any A, i, and j, if you know that Ordered(A,i+1,j), i<j, and A[i]≤A[i+1] are all true,
then you can assert Ordered(A,i,j). An equivalent way of writing R2 is:

R2A: FROM Ordered(A,i,j) ∧ i≤j ∧ A[i-1]≤A[i] INFER Ordered(A,i-1,j);

Rules like R1 are called BACKWARD rules; rules like R2 are called FORWARD rules. One
way to think about backward rules is that they work backward: setting up subgoals from goals.
Similarly, forward rules appear to work forward from assertions, generating new assertions.
Backward rules may be compared to PLANNER consequent theorems; forward rules to
antecedent theorems. Thus, though they may have the same logical meaning, they are applied
differently in the search for a proof.
4.2.2 The theorem prover

This section will provide a rough idea of how the rulehandler of the theorem prover works.

In the theorem prover, many proofs can be accomplished without rules, since decision procedures
for various theories (including equality and Presburger arithmetic) are built into the simplifier.
The built in theories are described in Part II, Chapter 3.

The prover tries to prove theorems by deriving contradictions in a data base. Thus, if we are
trying to prove A∧B ⊃ C, it asserts A and B, then asserts the negation of C, and finally tries to
show that this data base describes an impossible situation. For example, suppose we want to
prove x=y ⊃ f(x)=f(y). We first assert x=y. Then we assert the negation of the conclusion:
f(x)≠f(y). But by the properties of equality, these assertions cannot both be true, so the theorem is
proved.

This method may be likened to the standard "truth table" method for simplifying propositional
formulas, in that all possible assignments are considered for the propositional variables, "x=y" and
"f(x)=f(y)". For each of these assignments, we must show either that the formula simplifies
propositionally to TRUE, or that semantically the given assignment is impossible; that is, it
describes a contradiction. Thus, in the example, there were four cases to consider. Three of them
reduced to TRUE propositionally. The fourth, assigning TRUE to x=y and FALSE to f(x)=f(y),
resulted in a contradiction, eliminating that case from consideration. If we could not have
eliminated this case semantically, the formula would not simplify to TRUE, because this case
represented propositionally is TRUE ⊃ FALSE, or FALSE. Thus data base contexts always
represent conjunctions of literals; each literal is positive or negative depending on the truth-value
assignment in the current (non-tautological) case.
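
The four-case analysis just described can be written out as a small enumeration. The following Python is a sketch under assumptions: the two atoms are treated as booleans, and the hypothetical function `semantically_possible` hard-codes the one fact about equality that the simplifier would derive by itself (x=y forces f(x)=f(y)).

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def semantically_possible(eq_xy, eq_fxfy):
    """Equality reasoning: x = y forces f(x) = f(y)."""
    return not (eq_xy and not eq_fxfy)

def classify():
    """Classify each assignment to the atoms 'x=y' and 'f(x)=f(y)'."""
    result = {}
    for eq_xy, eq_fxfy in product([True, False], repeat=2):
        if implies(eq_xy, eq_fxfy):
            result[(eq_xy, eq_fxfy)] = "TRUE propositionally"
        elif not semantically_possible(eq_xy, eq_fxfy):
            result[(eq_xy, eq_fxfy)] = "contradiction"
        else:
            result[(eq_xy, eq_fxfy)] = "open"
    return result

for assignment, verdict in classify().items():
    print(assignment, verdict)
```

The formula simplifies to TRUE exactly when no assignment is left "open".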


4.2.3  Forward  rules 

Forward rules typically assert new facts to the data base as a consequence of old facts. For
example, the rules FROM B∧A INFER D and FROM D INFER C are used to prove the
verification condition A∧B ⊃ C in the following way: Initially, the data base is empty, and the
rules are "waiting" for instances of B and D to be asserted. First, A is asserted to the data base,
followed by B. After B is added, the first rule "fires" and waits now for an instance of A to be
asserted (literals in a FROM clause are considered from left to right). A is already in the
data base, so the rule immediately continues and asserts an instantiated D to the data base. The
state of the data base at this point may be represented by A∧B∧D.

After D is asserted, the second rule "fires" and asserts an instantiated version of C. Finally, ¬C is
asserted from the verification condition, and a contradiction is now evident: A∧B∧D∧C∧¬C.
Thus, A∧B ⊃ C has been proved using the two rules.

To prove A∧B ⊃ C∧D, multiple data base contexts would be used. First a contradiction would be
derived from A∧B∧¬C; then a contradiction would be derived from A∧B∧¬D, giving the proof.
These two sub-proofs share a subset of the data base, A∧B. That is, in both cases, we have
TRUE assigned to A and B. This means that forward rules triggered by A or B will "fire" once
only, and the results will go into the common data base.
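
The proof of A∧B ⊃ C from the two FROM rules can be imitated with a propositional forward chainer. This Python sketch rests on simplifying assumptions: literals are plain strings with "~" marking negation, there is no pattern matching or instantiation, and rules are re-scanned in a loop rather than triggered incrementally as in the rulehandler. All function names are hypothetical.

```python
def forward_chain(facts, rules):
    """Apply FROM-rules until no new literal can be asserted.

    facts: set of literals (strings; a leading '~' marks negation).
    rules: list of (from_literals, infer_literals) pairs.
    """
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusions in rules:
            if all(p in facts for p in premises) and not set(conclusions) <= facts:
                facts |= set(conclusions)
                changed = True
    return facts

def contradictory(facts):
    """A data base is contradictory if it holds both L and ~L."""
    return any("~" + lit in facts for lit in facts)

def proves(hypotheses, goal, rules):
    """Refutation: assert the hypotheses and the negated goal."""
    return contradictory(forward_chain(set(hypotheses) | {"~" + goal}, rules))

RULES = [(["B", "A"], ["D"]),   # FROM B and A INFER D
         (["D"], ["C"])]        # FROM D INFER C

print(proves(["A", "B"], "C", RULES))
# A and B imply C and D: one refutation context per conjunct of the conclusion.
print(all(proves(["A", "B"], g, RULES) for g in ["C", "D"]))
```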


4.2.4  Case  splits 

It is possible for a rule to require splitting of the data base into multiple contexts for the purpose
of considering cases. For example, the rule,

CAS: FROM A INFER CASES B; C END;

indicates that if A is true, then B∨C is true. That is, it indicates a disjunction between the
elements of the CASES clause. This rule would be used to prove the data base, ¬B∧¬C∧A,
inconsistent in the following manner: After A is asserted, the rule "fires" and indicates that a case
split is required. Case splits are delayed as much as possible, to take advantage of sharing of
common information in the multiple contexts. When the case split is done, two cases will be
considered. To prove the theorem, a contradiction must be derived from both cases. In the first
case, B is asserted to the data base, obtaining ¬B∧¬C∧A∧B, which is false. The other case,
¬B∧¬C∧A∧C, also simplifies to FALSE, resulting in a proof.

If  more  than  one  forward  CASES  rule  applies,  requiring  multiple  case  splitting,  the  cases  are 
nested,  so  the  total  number  of  cases  considered  will  be  the  product  of  the  numbers  of  cases 
propagated. 
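
The multiplicative growth of nested case splits can be seen in a short enumeration. The Python below is a sketch with assumed names and a string encoding of literals ("~" for negation); it performs every split eagerly, whereas the prover delays splits and eliminates cases where it can.

```python
from itertools import product

def contradictory(facts):
    """A context is contradictory if it holds both L and ~L."""
    return any("~" + lit in facts for lit in facts)

def refute_with_splits(facts, splits):
    """Return (proved, contexts_examined).

    splits: one list of cases per pending CASES rule; each case is a
    list of literals. Every combination of cases must be contradictory.
    """
    contexts = list(product(*splits))
    proved = all(
        contradictory(set(facts) | {lit for case in combo for lit in case})
        for combo in contexts
    )
    return proved, len(contexts)

# CAS: FROM A INFER CASES B; C END, applied to the data base ~B, ~C, A.
print(refute_with_splits({"~B", "~C", "A"}, [[["B"], ["C"]]]))
```

With a second pending split of three cases, the same call would examine 2 * 3 = 6 contexts.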


4.2.5  Backward  rules 

One way to think about a backward rule in this environment is to consider it as the
contrapositive of a forward rule. Thus, the backward rule INFER C FROM D could be
considered to be equivalent to FROM ¬C INFER ¬D. Now suppose we write a backward rule

C1: INFER A FROM B∧C;

The contrapositive of B∧C ⊃ A is ¬A ⊃ ¬B∨¬C. This could be written as the forward rule

C2: FROM ¬A INFER CASES ¬B; ¬C END;

From these two examples, it appears that all backward rules can be translated into equivalent
forward rules. Is there any difference, in fact, between forward and backward rules? There is,
and it will become apparent when we see how the system deals with more than one rule. Here are
two backward rules for Ordered:

ORD1: INFER Ordered(a,i,j) FROM i<j ∧ Ordered(a,i,j-1) ∧ a[j-1]≤a[j];

ORD2: INFER Ordered(a,i,j) FROM i<j ∧ Ordered(a,i+1,j) ∧ a[i]≤a[i+1];

Suppose we are trying to prove Ordered(B,M,N). The proof will go as follows: We try to use rule
ORD1 first, since it appears first. There are three cases to consider, corresponding to the three
literals in the FROM clause of the rule. These cases are tried sequentially. If any of these cases
does not simplify to FALSE, we abandon attempts at this rule and go on to consider the case split
required by the rule ORD2, which also has three cases. Thus we will try at most six cases, at least
two (one from each rule).

Each  of  the  cases  we  try  may  have  subcases,  generated  by  rules  that  become  applicable  due  to  the 
new  assertion  made  to  the  data  base  on  that  case.  The  process  of  applying  backward  rules  and 
splitting  in  this  manner  is  called  SUBGOALING. 

While backward-style splitting is more efficient than forward-style splitting, it is not
COMPLETE, in that forward-style splitting may yield a proof in examples where backward-style
splitting would not. The reader should be able to construct an example to illustrate this.

Thus, we distinguish between forward, complete splitting and backward, incomplete splitting. In a
given data base, with forward splitting, each applicable rule multiplies the maximum number of
cases considered; with backward splitting, each applicable rule adds to the maximum number of
cases considered. Had forward complete splitting been used with ORD1 and ORD2, at most 9
cases would have been considered, rather than just 6. For this reason, it is desirable to use
backward splitting (or subgoaling) whenever possible. To illustrate this: suppose there were 10
rules for Ordered similar to ORD1 and ORD2, each with three cases. If they were backward
rules, we would consider at most 30 cases. Were they forward rules, we would have to consider
some 3↑10 (that is, 59049) cases.
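
The arithmetic behind these bounds is small enough to check directly. The sketch below is not part of the verifier; the function names are hypothetical and each rule is represented only by its number of cases.

```python
from math import prod


def backward_max_cases(cases_per_rule):
    """Backward (incomplete) splitting: each applicable rule adds its cases."""
    return sum(cases_per_rule)


def forward_max_cases(cases_per_rule):
    """Forward (complete) splitting: each applicable rule multiplies the cases."""
    return prod(cases_per_rule)


two_rules = [3, 3]        # ORD1 and ORD2, three cases each
ten_rules = [3] * 10      # ten similar rules, three cases each

print(backward_max_cases(two_rules), forward_max_cases(two_rules))
print(backward_max_cases(ten_rules), forward_max_cases(ten_rules))
```

The sum grows linearly with the number of rules while the product grows exponentially, which is the reason given above for preferring subgoaling.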


4.2.6  Ordering  backward  rules 

In our proofs, splits are always delayed until all other assertions have been made to the data base.
All backward rules are considered to propagate splits. This includes rules like INFER P FROM
Q, which propagates a split with one case, and rules like INFER P, which propagates a split with
no cases. The reader should be able to convince himself that the rules INFER P FROM Q and
FROM ¬P INFER ¬Q are not equivalent for this reason: These rules are logically equivalent, but
not heuristically equivalent because incomplete splitting is used in the backward rule.

When more than one backward rule applies, rules are tried in the order they appear in the
rulefile, the data base of rules. By ordering rules carefully, the user can improve the speed of his
proofs. Consider the following four rules:

N1: Infer N(x,y) from P(x) ∧ Q(y);

N2: Infer N(x,y) from S(x,y);

N3: Infer N(x,y) from N(y,x);

N4: Infer N(x,y) from N(C(x),C(y));

The "easier," non-recursive rules appear first. When trying to prove N(A,B), non-N subgoals
would be tried in the following order (assuming none were provable): P(A), S(A,B), P(B), S(B,A),
P(C(B)), S(C(B),C(A)), P(C(A)), S(C(A),C(B)), P(C(C(A))), and so on. Q(x) is never tried, since
the P(x) case always fails.

The rule N3 will not loop forever. When P(A) and S(A,B) fail, the rule sets up the goal N(B,A).
After P(B) and S(B,A) fail (sub-subgoals from N1 and N2 on this subgoal), N3 applies again,
setting up the goal N(A,B). But "setting up a goal" means denying a fact to the data base. Since
¬N(A,B) already exists in the data base, denying N(A,B) again produces no effect, so no new rules
apply and the next subgoal, from N4, will be considered: N(C(B),C(A)). Infinite looping could
arise from N4, however, unless there is a rule which expresses C(x)=x. In general, it may be
extremely difficult, if not impossible, to write non-recursive rules for certain concepts. For this
reason, there are "depth-bounds" or cut-offs built into the rule mechanism to limit search.
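
The order of subgoals listed above can be reproduced with a small depth-first model. This is a simplified sketch and not the rulehandler's algorithm: terms are strings, every P and S subgoal is assumed to fail, a goal that has already been denied is skipped, and the function name and the bound parameter are hypothetical.

```python
def trace_subgoals(x, y, bound, denied=None, depth=0, out=None):
    """List the non-N subgoals tried while attempting N(x, y) with rules N1..N4."""
    if denied is None:
        denied = set()
    if out is None:
        out = []
    # Denying an already-denied goal has no effect; the depth bound cuts off N4.
    if (x, y) in denied or depth > bound:
        return out
    denied.add((x, y))
    out.append(f"P({x})")        # N1: P fails, so Q is never tried
    out.append(f"S({x},{y})")    # N2
    trace_subgoals(y, x, bound, denied, depth + 1, out)                    # N3
    trace_subgoals(f"C({x})", f"C({y})", bound, denied, depth + 1, out)    # N4
    return out


order = trace_subgoals("A", "B", bound=4)
print(order[:9])
```

The first nine entries follow the sequence given in the text; what comes after them depends on the bound chosen, which is an assumption of this sketch.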

Suppose the N rules had been ordered: N4, N1, N2, N3. Because we use a depth-first search
paradigm, the rule N4 would be applied recursively until the depth bound was reached before
any other subgoals were generated! Thus, if the depth bound were three, the first subgoal would
be P(C(C(A))).

In fact, strict depth-first search is not used; the rulehandler uses a combination breadth-first
depth-first search: All subgoals at a level are generated. If any of them can be proved without
further backward rules, they will not be set up as subgoals. Thus, even with the bad order of
rules, there would be no search in the proof of P(A)∧Q(A) ⊃ N(A,B).

Within a given rule, subgoals are tried in the order they appear in the FROM part of the rule.
Thus, "easier" literals should appear first, since, if their proofs failed, further cases which may
involve more extensive searching will not be tried. Considerations such as these would help the
user decide how to order the subgoals in the rule N1.


4.2.7  Introduction  to  matching 

Logically, rules are universally quantified statements, with quantification over all variables which
appear. Thus, the rule

M1: FROM P(x)∧Q(x) INFER R(x);

represents the logical statement ∀x[P(x)∧Q(x) ⊃ R(x)].

When a rule "fires," the effect internally is to make a copy of the rule with "constrained"
quantification. For example, suppose we are trying to prove

P(A) ∧ Q(B) ∧ P(C) ∧ A=B ⊃ R(B).

The first literal asserted to the data base is P(A). At this point, M1 fires, and waits for Q(A) to
be asserted. One way of viewing this is that a new rule,




M1A: FROM Q(A) INFER R(A);

has been added to the system, and that to avoid duplication, M1 has now been constrained to fire
with x distinct from A in this context. Q(A) is fully instantiated; in resolution terminology, it is a
ground literal. This means we can TEST its validity directly, rather than merely waiting for
"instances." Thus, if we had a forward rule

M2:  FROM  TRUE  INFER  Q(x); 

this rule would apply during the test of Q(A), and M1A would assert R(A). Had the literals in
M1's FROM clause been reversed, use of M2 would not have been possible since rules only apply
to literals which appear in the data base (and thus are ground).

However, there is no rule M2 in our example, so testing Q(A) fails, and the rule M1A continues
to wait. The next literal asserted to the data base from the theorem is Q(B). This does not fire
any rules. P(C) is the next literal asserted. At this point M1 fires again, since C is distinct from
A, and another virtual instance rule is created,

M1B: FROM Q(C) INFER R(C);

The data base is P(A)∧Q(B)∧P(C). A=B is now asserted. At this point, the rule M1A fires, since
Q(B)=Q(A) by the (built in) theory of equality, and R(A) is asserted. The data base is now

P(A) ∧ Q(B) ∧ P(C) ∧ A=B ∧ R(A).

Rule M1 is waiting for instances of P(x) where x is distinct from A, B, or C. Rule M1B is waiting
for Q(C) to become true. Rule M1A has already fired for all of its possible instances (only one).

Finally, the denial of the conclusion of the theorem is asserted, ¬R(B). Since R(A) and A=B are
both in the data base, a contradiction is indicated. Thus, we have proved the theorem using the
rule M1.
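
The role played by the built-in theory of equality in this proof can be illustrated with a small Python model. It is a sketch under strong assumptions: facts are (predicate, term) pairs, equalities are kept in a simple union-find structure, and rule M1 is applied by exhaustive lookup rather than by the waiting and testing mechanism described above. The class and function names are hypothetical.

```python
class Equalities:
    """Minimal union-find over term names, standing in for the equality theory."""

    def __init__(self):
        self.parent = {}

    def find(self, term):
        self.parent.setdefault(term, term)
        while self.parent[term] != term:
            term = self.parent[term]
        return term

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def holds(facts, eq, pred, term):
    """True if pred holds for some term known to be equal to the given term."""
    return any(p == pred and eq.find(t) == eq.find(term) for p, t in facts)


def apply_m1(facts, eq):
    """M1: FROM P(x) and Q(x) INFER R(x), matching modulo known equalities."""
    new = {("R", t) for p, t in facts if p == "P" and holds(facts, eq, "Q", t)}
    return facts | new


eq = Equalities()
facts = {("P", "A"), ("Q", "B"), ("P", "C")}

before = apply_m1(facts, eq)      # A=B not yet asserted: nothing fires
eq.union("A", "B")                # assert A=B
after = apply_m1(facts, eq)       # now Q(B) counts as Q(A), so R(A) is asserted

print(before == facts, holds(after, eq, "R", "B"))
```

Once A=B is known, R(A) also counts as R(B), so denying R(B) would contradict the data base, as in the proof above.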

We make several observations about this proof. Forward rules without CASES are always
"waiting" on some literal pattern. If this literal pattern is not fully instantiated (for example, P(x)
in M1), the prover will wait for instances to appear in the data base. On the other hand, if the
literal is fully instantiated (for example Q(A) in M1A), the prover not only waits for the literal to
appear, it also "tests" the literal for validity in the data base. This means that in each distinct
context in the data base, the literal will be denied in an effort to obtain a contradiction. During
the test of the literal, forward rules may be applied, resulting in proof of the literal in the given
data base.


4.2.8  Ordering  within  rules 

Suppose we want to prove A>0 ∧ P(A+1) ⊃ Q(A+1), and we know (∀x)(x>0 ∧ P(x) ⊃ Q(x)). There
are two ways we could write forward rules to express this fact:

AR1: FROM x>0 ∧ P(x) INFER Q(x);

AR2: FROM P(x) ∧ x>0 INFER Q(x);

Consider using AR1. When A>0 is asserted, AR1 fires, and creates a virtual instantiated rule,

AR1A: FROM P(A) INFER Q(A);

in the data base, since x was bound to A when AR1 fired. With this rule, we cannot prove the
theorem. Suppose we are using AR2 instead. After P(A+1) is asserted, the rule instantiates to

AR2A: FROM A+1>0 INFER Q(A+1);

Testing A+1>0 succeeds, since A>0 is in the data base, and thus is known to the built in
Presburger arithmetic prover. Therefore, Q(A+1) is asserted, and the theorem is proved. This
illustrates the fact that the order in which literals appear in a rule affects the ability of the system
to obtain a proof.

This ordering constraint also holds for backward rules because cases are considered in the order
they appear in the rule. Suppose we had the rules

ORD3: INFER Ordered(a,x,y) FROM x<y ∧ Ordered(a,x,z) ∧ Ordered(a,z,y);

ORD4: INFER Ordered(a,x,y) FROM x<y ∧ Ordered(a,z,y) ∧ Ordered(a,x,z);

Consider using ORD3 to prove, say, Ordered(B,I,J). The first subgoal is I<J, which is fully
instantiated, and thus will be tested in the data base, and may require further backward rules for
proof. If it is provable, then the rule waits for some z such that Ordered(B,I,z) exists in the data
base. If it finds an instance, say where z=K, it sets up the further subgoal Ordered(B,K,J), which
is fully instantiated and thus may use further backward rules for proof. Thus, Ordered(B,I,K)
must actually appear in the data base, in order to provide an instance for z, while Ordered(B,K,J)
need only be derivable from rules. Had we used ORD4, the situation would have been reversed.
Thus, the two rules are not equivalent, and both may be required for some proofs.

This ordering constraint should not be viewed as a weakness of the rulehandler, since by giving
all permutations, it could be circumvented. Indeed, it provides the user with a way of controlling
proof search since he can predict which literals will be uninstantiated.


4.2.9  Rule  schemata:  Whenever  and  Replace 


Suppose we desired to assert x*y≥0 whenever we saw a product, x*y, and it was evident that x≥0
and y≥0. We could write

MUL1: FROM x≥0 ∧ y≥0 INFER x*y≥0;


Consider the effect of this rule. Whenever an assertion is made in the data base of the form E≥0
for any expression E, both literals in the FROM clause will match. Thus, for every pair E and F
where both E≥0 and F≥0 (possibly E=F, of course), the rule will fire, asserting E*F≥0. This adds a
new inequality assertion to the data base, and so the rule will match many times more. In the end,
many useless facts will get asserted and much prover time will be wasted, since the rule matches
indiscriminately.

We can remedy this by using a device called a RULE SCHEMA, which allows us to give a
"trigger pattern." The rule

MUL2: WHENEVER x*y FROM x≥0 ∧ y≥0 INFER x*y≥0;

says that whenever a product, A*B, appears in the data base, an instantiated version of MUL2
will appear:

MUL2A: FROM A≥0 ∧ B≥0 INFER A*B≥0;

Here, all literals are instantiated, so the validity of the FROM literals can be tested. Further, the
rule applies only to products which actually appear in the data base. Thus, adding the
WHENEVER clause weakens the rule by restricting its application so it makes assertions only
about products which appear in the data base. However, the WHENEVER clause also
strengthens the rule by causing the FROM literals to be fully instantiated, and thus subject to
testing in the data base. That is, the WHENEVER rule, MUL2, would prove

A≥0 ∧ B≥0 ⊃ (A+1)*B≥0

while MUL1 would not. While they have different heuristic meanings, logically, MUL2 and
MUL1 express the same fact.
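
The difference in heuristic behaviour can be made concrete by counting facts. The following Python sketch is an illustration only: expressions known to be non-negative are held in a set, products are tuples, and membership in the set stands in for the prover's validity test. The function names are hypothetical.

```python
def mul1_round(nonneg):
    """One round of MUL1: every ordered pair of known non-negative
    expressions yields a new product that is asserted non-negative."""
    return nonneg | {("*", e, f) for e in nonneg for f in nonneg}


def mul2_round(nonneg, products_in_db):
    """MUL2: only products that actually occur in the data base trigger
    the rule; the FROM literals are then checked for each of them."""
    return nonneg | {
        p for p in products_in_db if p[1] in nonneg and p[2] in nonneg
    }


facts = {"A", "B"}                 # A and B are known to be non-negative
sizes = []
for _ in range(3):
    facts = mul1_round(facts)
    sizes.append(len(facts))
print(sizes)                       # growth of the MUL1 data base per round

triggered = mul2_round({"A", "B"}, {("*", "A", "B")})
print(len(triggered))
```

Under MUL1 the set roughly squares in size on each round, while under MUL2 one fact is added for the one product that appears.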


One very common application of WHENEVER is asserting equalities between terms. For
example,

GCD1: WHENEVER GCD(x,y) FROM x MOD y = 0 INFER GCD(x,y)=y;

Another way of writing this rule is

GCD2: REPLACE GCD(x,y) WHERE x MOD y = 0 BY y;

This rule is semantically equivalent to GCD1. The "REPLACE" syntax is used for historical
reasons; in fact, there is no actual rewriting or replacement; an equality is asserted. Thus,
REPLACE rules may be viewed as statements of directional equalities.

Because of the structure of the data base, rules like

TWIST: REPLACE F(x,y) BY F(y,x);

will cause no looping. Similarly, replacement rules can be provided for both directions of an
equality:

ASC1: REPLACE F(x,F(y,z)) BY F(F(x,y),z);

ASC2: REPLACE F(F(x,y),z) BY F(x,F(y,z));

WHENEVER clauses may include more than one trigger pattern. All triggers must match before
the instantiated rule will appear. Triggers are expressions which contain no propositional
operators and are not individual variables.


4.2.10  Levels  of  proof 

Thus far, we have seen that there are two levels of interaction between instantiated literals in
rules and the data base. A literal in a rule is a FINAL literal if it occurs in the FROM clause of
a backward rule or in the INFER clause of a forward rule. Final literals are those literals which,
when instantiated, can get asserted "permanently" in a data base context, which may be the
branch of a split. Since these literals become part of the data base, they can cause other rules to
be applied, and further splits to be generated. Thus they have the same status as literals which
actually occur in the theorem to be proved.

Literals which are not final are called TRANSITION literals. In general, the prover waits for
transition literals to become true before "firing" a rule. When the rulehandler is attempting to
establish validity of an instantiated transition literal, it will test that literal in the data base at
various times. During testing, forward rules which don't split may be applied, as well as
knowledge from the built in theories. Presently, there is also the restriction that when a transition
literal is being tested, nested tests will not be done; that is, they will fail. Thus, the following rules
will not work together as expected:

TR1: FROM P(x) ∧ Q(F(x)) INFER R(x);

TR2: REPLACE F(x) WHERE S(x) BY G(x);

The prover uses a process called FIND to locate instantiations for uninstantiated literals. In
general, this means that a literal must be found in the data base which matches the pattern literal
in the rule. In the case of equalities, the process is slightly more powerful. If both sides of the
pattern equality are uninstantiated, an actual matching equality must be found in the data base.
Otherwise, when only one side of the equality is uninstantiated, the prover will wait for an
instance of this side of the equality to become equivalent to the value in the data base
corresponding to the instantiated side of the equality pattern.

All uninstantiated literals are proved with FIND. Thus, if ATOM(A) is asserted,

ATOM: FROM ATOM(x) INFER x≠CONS(y,z);

will cause the prover to wait for an instance of CONS(y,z) to become equivalent to A. If such an
instance appears, a contradiction will be propagated.

Uninstantiated literals should, of course, not be single pattern variables.


4.2.11 Forward cases

In order to increase efficiency of proofs, a case elimination mechanism has been built into the
forward CASES construct. Let <A,[I],E> represent the array A after the assignment A[I]←E has
been performed. Thus, we have

<a,[i],e>[i]=e and i≠j ⊃ <a,[i],e>[j]=a[j]

This fact can be written as a REPLACE rule

ARRAY: REPLACE <a,[i],e>[j] BY CASES i=j → e; i≠j → a[j] END;

This rule is equivalent to

ARRAY1: WHENEVER <a,[i],e>[j] FROM TRUE INFER CASES
        i=j → <a,[i],e>[j]=e;
        i≠j → <a,[i],e>[j]=a[j] END;

Interpret the "arrow" in the CASES clause as AND. Suppose we wish to prove

<A,[I],A[I]>[J]=A[J].

The rule splits and considers two cases

<A,[I],A[I]>[J]≠A[J] ∧ I=J ∧ <A,[I],A[I]>[J]=A[I]

<A,[I],A[I]>[J]≠A[J] ∧ I≠J ∧ <A,[I],A[I]>[J]=A[J]

Both cases simplify to FALSE, proving the theorem. In this example, the case split is required
for proof.

Suppose, however, we were proving

<<B,[1],2>,[2],3>[1]=2.

If splits were done, four cases would be considered, three of which would be eliminated trivially.
To avoid unnecessary splitting, and unnecessary delay of assertions of facts from forward rules,
cases can be eliminated dynamically once a split has been propagated. As soon as only one case
remains, its facts are asserted immediately. In our example, the rule ARRAY first applies with
a=<B,[1],2>, i=2, e=3, j=1. The first case, with i=j or 2=1, is eliminated by test as soon as the rule
applies, causing the other branch of the split to be propagated as fact. Thus the data base
becomes,

<<B,[1],2>,[2],3>[1]≠2 ∧ 2≠1 ∧ <<B,[1],2>,[2],3>[1]=<B,[1],2>[1].

At this point, the rule applies again, with a=B, i=1, e=2, j=1. The second case is eliminated by
test, and the fact <B,[1],2>[1]=2 is propagated causing an immediate contradiction.

Thus, forward CASES generally cause splitting only when a split is necessary or further splits are
required to eliminate cases. Forward splits will not occur, however, within the proof of a subgoal
for a backwards rule, though case elimination may cause facts to be asserted.
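
When the indices are constants, case elimination amounts to evaluating the array expression directly. The Python sketch below models <a,[i],e> as a tagged tuple and the ARRAY rule as a lookup in which the guard i=j is decided by comparing constants; it does not model symbolic indices, where a genuine split would be needed. The names store and select are hypothetical.

```python
def store(a, i, e):
    """Build the term <a,[i],e>: array a after the assignment a[i] := e."""
    return ("store", a, i, e)


def select(a, j):
    """Evaluate a[j]. Each layer of <a,[i],e> offers two cases (i=j, i!=j);
    with constant indices exactly one case survives, so no split is needed."""
    while isinstance(a, tuple) and a[0] == "store":
        _, inner, i, e = a
        if i == j:          # case i=j survives: the value is e
            return e
        a = inner           # case i!=j survives: look in the inner array
    return a[j]


B = {1: 7, 2: 8}                       # an arbitrary base array (assumed values)
term = store(store(B, 1, 2), 2, 3)     # <<B,[1],2>,[2],3>

print(select(term, 1))                 # the example above: element 1
```

The first layer eliminates the case 2=1 and the second layer eliminates the case 1≠1, which mirrors the two applications of ARRAY in the text.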

For efficiency reasons, only literals appearing before the arrow in a given case are used to
eliminate other cases. Thus, the rule

CAS: FROM P INFER CASES Q1 → R1; Q2 → R2; Q3 → R3 END;

is equivalent to the set of rules

CAS1: FROM P ∧ ¬Q2 ∧ ¬Q3 INFER Q1 ∧ R1;

CAS2: FROM P ∧ ¬Q1 ∧ ¬Q3 INFER Q2 ∧ R2;

CAS3: FROM P ∧ ¬Q1 ∧ ¬Q2 INFER Q3 ∧ R3;

assuming it never splits. If any of the Qi or Ri are not fully instantiated when the split is
propagated, further instantiations will occur only within each case, not across cases. The following
two rules are equivalent:

CS1: FROM P(x) INFER CASES Q1(x,y); Q2(x,y) END;

CS2: FROM P(x) INFER CASES Q1(x,y); Q2(x,z) END;


4.2.12 Semantic matching

Suppose we had a rule

SM1: INFER P(x+1) FROM P(x) ∧ Q(x);

If we wanted this rule to apply in proving P(2+A) from P(A+1) and Q(1+A), the pattern matcher
would need to have some knowledge of properties of addition. We call this type of matching
Semantic Matching. The matcher used in the rulehandler makes use of properties of addition,
multiplication, arithmetic relations, and equality. The matcher assumes that all variables
appearing in sums and products are integer valued. This is conservative in the sense that no
additional matches are obtained by the assumption, while many are eliminated.

Properties of addition and multiplication used are commutativity, associativity, identity, and in the
case of addition, multiplication by constants. In the case of the relational operators, the integer
assumption makes X≥0 and X+1>0 equivalent. In fact, the prover stores all inequalities of a
given sign internally in the form E>0 for some expression E. This means that the pattern F(x)≥y
will match A+B<F(C)+G(D), binding x to C and y to A+B-G(D)+1. Note that only a negated
inequality pattern will match a negated inequality in the data base, however.
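
The binding for y in the inequality match can be checked by brute force over a small range of integers. This Python sketch is only a sanity check of the arithmetic; the variable names fc and gd stand for the integer values of F(C) and G(D) and are assumptions of the example.

```python
from itertools import product


def target(a, b, fc, gd):
    """The data base literal A+B < F(C)+G(D)."""
    return a + b < fc + gd


def instantiated_pattern(a, b, fc, gd):
    """The pattern F(x) >= y with x bound to C and y bound to A+B-G(D)+1."""
    return fc >= a + b - gd + 1


values = range(-3, 4)
all_agree = all(
    target(*v) == instantiated_pattern(*v) for v in product(values, repeat=4)
)

# The same integer assumption makes X >= 0 and X+1 > 0 interchangeable.
shift_agrees = all((x >= 0) == (x + 1 > 0) for x in values)

print(all_agree, shift_agrees)
```

Both checks rely on the variables being integers; over the rationals the strict and non-strict forms would not coincide in this way.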

Equality matching makes use of the symmetric and substitutive properties of equality. Thus, the
rule

EQ1: INFER x=y FROM P(x,y);

will prove P(B,A) ⊃ A=B.

Often, patterns will not match as soon as the target is found because facts asserted later in proof
are required for the match. For example, in proving P(A) ∧ A=F(B) ⊃ Q(B) with the rule

Q1: FROM P(F(x)) INFER Q(x);

the rule applies only after A=F(B) has been asserted to the data base. For efficiency reasons, this
sort of "waiting" does NOT take place with semantic patterns. Thus, P(A*B) ∧ A=F(C) ⊃ Q(B,C)
will not be proved by

Q2: FROM P(x*F(y)) INFER Q(x,y);

while A=F(C) ∧ P(A*B) ⊃ Q(B,C) will. This limitation is not a serious one in practice, and may
be circumvented by using a WHENEVER clause, as in

Q3: WHENEVER F(y) FROM P(x*F(y)) INFER Q(x,y);


4.2.13  Subspace  matching 

When matching a pattern like x+y against a sum, it is possible that many distinct matches will
result. For this reason, certain sum matches produce "subspace" specifications as their result. For
example, matching x+y against A+B+C produces the specification x+y=(A+B+C), which
represents a linear equation with variables x and y. When x or y appear in further patterns, they
will be considered to be unbound, except subject to the constraint of this equation. Multiple
constraints are merged using Gaussian Elimination over the integers. Thus P(x+y,x-y) will
match P(A,A-4*B), binding x to A-2*B and y to 2*B. P(x+y,x-y) and P(A,A-3*B) will not
match, however, because x and y are considered to be integer variables.
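
The two examples above can be worked through with a small solver for the pair of constraints x+y=s and x-y=d. This is a sketch for this one pattern only, not general Gaussian elimination: linear forms are dictionaries from symbol names to integer coefficients, and a solution is accepted only if every resulting coefficient is an integer. The function name is hypothetical.

```python
def solve_sum_diff(s, d):
    """Solve x+y = s and x-y = d for linear forms s and d over the integers.

    Returns (x, y) as coefficient dictionaries, or None when some coefficient
    of 2x = s+d or 2y = s-d is odd, so that no integer-valued binding exists.
    """
    symbols = set(s) | set(d)
    twice_x = {v: s.get(v, 0) + d.get(v, 0) for v in symbols}
    twice_y = {v: s.get(v, 0) - d.get(v, 0) for v in symbols}
    if any(c % 2 for c in list(twice_x.values()) + list(twice_y.values())):
        return None
    x = {v: c // 2 for v, c in twice_x.items() if c}
    y = {v: c // 2 for v, c in twice_y.items() if c}
    return x, y


# P(x+y, x-y) against P(A, A-4*B)
print(solve_sum_diff({"A": 1}, {"A": 1, "B": -4}))

# P(x+y, x-y) against P(A, A-3*B)
print(solve_sum_diff({"A": 1}, {"A": 1, "B": -3}))
```

The first call yields x = A-2*B and y = 2*B; the second fails because 3*B cannot be halved for every integer B.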

Subspace  matching  is  a powerful  facility,  but  it  is  not  desirable  in  certain  instances.  Consider  the 
rule 

DIST:  REPLACE  (a+b)*c  BY  a*c  + b*c; 

Since  a and  b will  be  part  of  a subspace  specification,  the  BY  clause  will  not  be  instantiated, 
severely  limiting  applicability  of  the  rule.  For  this  reason,  a facility  has  been  provided  which 
allows  extra  specific  instances  to  be  generated  by  the  matcher  in  addition  to  the  subspace 
specifications.  This  facility  is  controllable  by  a switch  (called  SUMMATCH),  since  for  efficiency 
reasons,  it  may  not  always  be  desirable. 

In some cases, it may be necessary to eliminate the subspace match entirely. If we were simplifying

P(A) ∧ P(3) ∧ P(B) ∧ P(C) ⊃ P(C+B)

with the rule

SUMP: INFER P(x+y) FROM P(x) ∧ P(y);

many unnecessary subgoals would be generated. When the rule matches, it generates a subspace
specification for x+y=(C+B) in its virtual instance. FIND is used to locate instances of P(x), since
x is not fully specified. The first instance is P(A), resulting in the binding of x to A and y to
C+B-A by solving through. By this process, many useless goals will be generated. If we could
somehow guarantee that P(x) would be instantiated, we would not have this problem. One way to
do this is to invent a new predicate which does not appear in the theorems to be proved.
Suppose we replaced SUMP by

SUMP2: INFER P(x+y) FROM INST(x) ∧ P(x) ∧ P(y);

SUMAUX: INFER INST(x);

Since INST does not appear in the data base, it can only be proved with rules. But rules only
"fire" on instantiated literals, so FIND will always fail on INST, eliminating the subspace matches.
Thus, only the specific instances (provided as a result of setting the switch mentioned above) will
be considered. This combination of rules guarantees that P(x) (and consequently P(y)) in SUMP2
will be instantiated.


4.2.14  Efficiency  considerations 

The user is reminded that the theorem prover is limited in its capacity. Rules may be thought of
as a device for programming the theorem prover: it is easy to write inefficient programs, harder
to write efficient ones. Like programs, inefficient rulefiles cause the prover to use excessive time
or space, running until either the patience of the programmer or core storage is exhausted. This
sort of inefficiency can be prevented in many cases by merely considering efficiency as well as
logical elegance when writing rulefiles. Remember, however, that there are many concepts that are
difficult to code effectively as rules.

Beware  of  excess  searching  caused  by  badly  ordered  backward  rules.  When  writing  rulefiles, 
consider  how  to  order  the  rules  so  search  will  be  efficient.  Simply  reordering  rules  and  literals 
within  rules  can  lead  to  dramatic  decreases  in  proof  times. 

Beware  of  forward  rules  asserting  multitudes  of  useless  facts  and  causing  unnecessary  splits. 
Strengthen  FROM  clauses  to  restrict  application. 

Beware of rules that create numerous virtual instances. For example,

LOSS: WHENEVER F(x), F(y) FROM F(x)=F(y) INFER P(x,y);

will create n↑2 instances of the rule if there are n instances of F(·) in the data base. While most
of the virtual instances may not fire, their presence in the data base will increase the space
required for proof.

Plan carefully whether to use forward or backward rules, or both, to express a particular concept.
Forward rules are most effective for "complete" domains, where all relevant facts can be
propagated immediately. Examples of such domains are simple data structures, type properties of
program data objects, and some simple arithmetic facts. Backward rules are best suited to larger
"incomplete" concepts, where forward inference would produce too many facts or could not
generate all relevant facts. Ordered and Permutation are examples of such concepts.

Adjust  the  depth  bounds  to  conservative  values  before  attempting  a large  proof.  Some  domains 
with  very  broad  search  spaces  need  shallow  bounds,  while  other  domains  which  require  narrow, 
deep  searching  need  to  have  the  bounds  set  accordingly.  Rules  which  require  broad  deep 
searches  will  be  inefficient;  it  may  be  advisable  to  rethink  their  structure. 

In  general,  the  best  advice  is  to  understand  what  a set  of  rules  means  from  both  the  heuristic  and 
logical  viewpoints.  Syntactically  translating  logical  statements  into  rules  without  regard  to 
efficiency  can  lead  to  prolonged  and  wasteful  searches. 


4.2.15  A note  on  multiplication 

The  built  in  Presburger  arithmetic  package  (which  is  independent  of  the  rulehandler  and 
semantic  matcher)  includes  a facility  for  recognizing  multiplication  by  constants.  However,  this 
facility  is  equivalent  to  the  set  of  rules: 

REPLACE -1*x BY -x
REPLACE 0*x BY 0
REPLACE 1*x BY x
REPLACE 2*x BY x+x

and so on. This means that without rules, the formula P(x*y) ∧ x=0 ⊃ P(0) will not simplify,
while the formula x=0 ∧ P(x*y) ⊃ P(0) will simplify. This unfortunate weakness can often be
circumvented partially by adding rules of the sort:

REPLACE x*y WHERE x=1 BY y
REPLACE x*y WHERE x=2 BY y+y

and  so  on,  where  necessary. 


Appendix A


Command Syntax


Alternate notation

assignment                        ← or :=
greater or equal                  ≥ or >=
implication sign                  → or ->
or                                ∨ or ! or |
negation                          ¬ or ~
history sequence concatenation    • or !! or ||
less or equal                     ≤ or <=
not equal                         ≠ or <>
and                               ∧ or &
reference class extension         ∪ or &&
reference class selection         ⊂ or (\ , ⊃ or \]


A.1 Command syntax

<command>
    → <imperative_command>
    → <setparm_command>
    → <information_command>
    → QUIT
    → LISP

<imperative_command>
    → <read_command>
    → <print_command>
    → <simp_command>
    → <load_command>
    → <dump_command>
    → <delete_command>


<setparm_command>
    → <alias_command>
    → <set_command>
    → <reset_command>
    → <open_command>
    → CLOSE


< i n f or ma  t i on_command> 
* HELP 


u 


SHOU 


ident i f ier 
<par_name>- 


>J 


STATUS 


A-i 
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<readwcoimnand> 


T 


READ  »<f  1 le-name>- 

READVC 


<pr lnt_command> 
— ► PR I NT VC  -r- 


;to_f I le_na» 


►<swi  tcbes>— J «— ►<vc— speo- 


<8 i mp_command> 


t: 


SIMPLIFY  1 

RESIMPLIFY  — ^ * — +<eu I tches>— ^ L »<vc_8pec>- 


< I oad_command> 


LOAD  — »<short_f i le_name>- 

LOADVC  »<f  i le_name>- 

LOADRULE  -J 


<dump_command> 


DUMP  ■*<shor  t_f  i I e_name>- 

DUMPVC  ><f  i le_name> p— 


*<vc_epec>- 


DUMPRULE  — ►< f i I e name>- 


, — »<r  u I e_f 1 1 e_nana>-^ 


<d« I •t«_connand> 


OELRFILE 


OELNULE 


*<rule_f I le_name>- 


<rule_na*e>- 


Command  Syntax 


i 


I 

k 
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Command  Syntax 


<a 1 1 aa_coMMand> 
— ► ALIAS  -i 


<identlf ler>-  ■ xidentlf ier>- 


*<nu«ber> 


i er>—  »< 

_rc 


nu«iber>- 


<eet_command> 

—*  SET  — »<par_eet t inga>- 

<reset_cominand> 

— ► RESET  »<par_nawe>- 


<open_command> 

— ► OPENFILE  — ►<  f i I e_na»e>- 
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<to  file  name> 


C‘  TO  »<conplete_f i le_spec>-J 

■»  —I 


<f i I e_name> 


*<comp I e te_f i I e_epec>- 


< short  file  name> 


T 

*< ident i f ier>— ' 


<complete_f  i le_spec>  is  the  standard  monitor  syntax  for  a file  name. 


<vc_spec> 


►<  i dent i f i er>- 


»<ni!Biber> 


<rule_name>,  <rule_f i le_name> 
►<  ident  i f ier>— » 


<sui tche9> 

— » ( — ♦<par_sett ings>— * ) 


<par_eettings> 

><par_name>- 


♦<par_va  I ue>- 


<par_name>,  <par_va  lue>  refer  to  the  list  of  System  Parameters  given  In  Part  II,  Section  2.2. 


Verifier Syntax


A.2.1 Outer level of input

[Syntax diagrams not reproduced; legible nonterminals and keywords are listed per diagram, in
source order.]

<source input>
    <rules>, <Pascal program>

<rules>
    RULEFILE, ( <identifier> ), <rule-dec>, <rule-stmt>

<rule-dec>
    CONSTANT, CONST, PATTERN, <identifier>

<rule-stmt>
    <rule_label>, <backwardstatement>, <replacestatement>, <forwardstatement>

<rule_label>
    <identifier>

<Pascal program>
    PASCAL, <declarations>, <main block>

A.2.2 Statements that appear in rulefiles

[Syntax diagrams not reproduced; legible nonterminals and keywords are listed per diagram, in
source order.]

<backwardstatement>
    INFER, <r_conjunction>, <r_from_part>, <r_whenever_part>

<replacestatement>
    REPLACE, <r_relational>, BY, <r_case_exp>, <r_where_part>

<forwardstatement>
    <r_whenever_part>, <r_from_part>, INFER, <r_case_exp>

<r_from_part>
    FROM, <r_expression>

<r_whenever_part>
    WHENEVER, <r_expression>

<r_where_part>
    WHERE, <r_expression>


A.2.3 Expressions that appear in rulefiles

[Syntax diagrams not reproduced; legible nonterminals and keywords are listed per diagram, in
source order. Operator symbols inside the diagrams are mostly illegible.]

<r_case_exp>
    <r_expression>, CASES, <r_expression>, <r_expression>, END

<r_expression>
    <r_disjunction>

<r_disjunction>
    <r_conjunction>, <r_disjunction>, <r_conjunction>

<r_conjunction>
    <r_not_expression>, <r_not_expression>

<r_not_expression>
    <r_relational>

<r_relational>
    <r_simple_expression>, <r_relop>, <r_simple_expression>

<r_relop>
    (diagram not recoverable)

<r_simple_expression>
    (diagram not recoverable)

<r_variable>
    <r_histexp>, <r_data_triple>, <identifier>, <r_modfn>, <r_select>

<r_modfn>
    . <identifier> ( <r_exlist> )

<r_select>
    . <identifier>
    ⊂ <r_variable> ⊃
    (illegible symbol) <identifier>
    [ <r_exlist> ]

<r_histexp>
    <identifier>, ( <r_exlist> )

<r_exlist>
    <r_expression>

<r_data_triple>
    < <r_variable>, <identifier>, <r_select>, <r_tripxp> >

<r_tripxp> is the same as <r_expression> except that at the level of r_relop, the relational
operator ">" is omitted. This has the effect that expressions containing this operator must be
enclosed in parentheses when appearing in the final portion of a data triple. It is required to
eliminate ambiguities caused by using > to terminate a data triple.


A.2.4 Outer structure of Pascal programs

[Syntax diagrams not reproduced; legible nonterminals and keywords are listed per diagram, in
source order.]

<declarations>
    <specifications>, <procs>

<specifications>
    <label dec>, <const dec>, <decl-stmts>

<main block>
    <in-out assertions>, BEGIN, <compound tail>

A.2.5 Nonexecutable statements

[Syntax diagrams not reproduced; legible nonterminals and keywords are listed per diagram, in
source order.]

<label dec>
    LABEL

<const dec>
    CONSTANT, CONST, <identifier>, <signed number>, <string>

<decl-stmts>
    <type dec>, <var dec>, <module dec>, <scheduler dec>, <create dec>


A.2.6 Procedure declarations and associated assertions

[Syntax diagrams not reproduced; legible nonterminals and keywords are listed per diagram, in
source order.]

<procs>
    (diagram not recoverable)

<proc dec>
    PROCEDURE

<fun dec>
    FUNCTION, <identifier>, <fun params>, <Pascal type>, <fun assertions>

<globals>
    GLOBAL, ( VAR <identifier> )

<fun assertions>
    <fun globals>, <in-out assertions>

<fun globals>
    GLOBAL, ( <identifier> )

<in-out assertions>
    <initial stmt>, <entry stmt>, <exit stmt>

<initial stmt>
    INITIAL, <identifier>, =, #, <identifier>

<entry stmt>
    ENTRY, <a_expression>

<exit stmt>
    EXIT, <a_expression>


A.2.7 Module and Scheduler declaration

[Syntax diagrams not reproduced; legible nonterminals and keywords are listed per diagram, in
source order.]

<module dec>
    MODULE, <identifier>, <visible part>, <module invisible part>, EXTERNAL, EXTERN

<module invisible part>
    INVISIBLE, <invisible part>, SCHEDULED BY, <identifier>, <cond dec>

<scheduler dec>
    SCHEDULER, <identifier>, <sched visible part>, <invisible part>, EXTERNAL, EXTERN

<sched visible part>
    RECEIVES, <identifier>, <visible part>

<visible part>
    VISIBLE, <visible item dec>, <basetype dec>, <axiom dec>

<invisible part>
    INVISIBLE, <specifications>, <boundaries>, <procs>, <mod init>, END

<basetype dec>
    (diagram not recoverable)

<visible item dec>
    (diagram not recoverable)

<visible proc dec>
    PROCEDURE, <identifier>

<visible fun dec>
    FUNCTION, <identifier>, <fun params>, <Pascal type>

<axiom dec>
    AXIOMS, <r_expression>, FOR ALL, ( <axiom-spec>

<axiom-spec>
    (diagram not recoverable)

<boundaries>
    <invariant stmt>

<invariant stmt>
    INVARIANT, <a_expression>

<invis-basetype>
    (diagram not recoverable)

<mod init>
    BEGIN, <compound tail>
A.2.8 Module and condition variable instantiation

<create dec>
    (diagram not recoverable)


A.2.9 Pascal type declarations

[Syntax diagrams not reproduced; legible nonterminals and keywords are listed per diagram, in
source order.]

<Pascal type>
    <simple type>, <array type>, <record type>, <pointer type>, <union type>, <file type>

<simple type>
    <identifier>, <identifier> . <identifier>, <signed number>, ( <identifier> ... )

<array type>
    ARRAY, [ <limited simple type> ], OF, <Pascal type>

<limited simple type>
    <identifier>, <signed number>

<signed number>
    <number>

<record type>
    <Pascal type>, END

<union type>
    UNION, <identifier>, <Pascal type>, END

<pointer type>
    ↑ <identifier>

<file type>
    FILE, OF, <file Pascal type>

<file Pascal type> is the same as <Pascal type> except that it does not contain <pointer type>.


A.2.10 Executable statements

[Syntax diagrams not reproduced; legible nonterminals and keywords are listed per diagram, in
source order.]

<compound tail>
    <number>, <statement>, END

<statement>
    (diagram not recoverable)

<for statement>
    FOR, <identifier>, <expression>, TO, DOWNTO, <expression>, INVARIANT, <a_expression>,
    DO, <statement>

<repeat statement>
    REPEAT, <statement>, UNTIL, <expression>, INVARIANT, <a_expression>

<case statement>
    CASE, <expression>, OF, <number>, <identifier>, <statement>

A.2.11 Expressions in Pascal programs

[Syntax diagrams not reproduced; legible nonterminals and keywords are listed per diagram, in
source order.]

<expression>
    <simple expression>, <r_relop>, <simple expression>

<simple expression>
    <term>

<term>
    <factor>

<factor>
    <number>, <string>, ( <expression> ), NOT <factor>, <variable>

<variable>
    <identifier>, ( <expression> ), <postap>

<postap>
    <identifier>, : <identifier>, ( <expression> ), <expression>


<a_expression> is the same as <r_expression> with the following changes. A union selection
:<identifier> may be followed by an expression in parentheses; this permits the parser to
automatically build the union construction, as in executable statements. The history sequence
operator • is prohibited; record fields indicated by a period (.) may not have a parameter list
following the fieldname. These restrictions have the effect of prohibiting module history sequence
statements.

<number> is an unsigned constant.

<string> is a character string.

<identifier> is a sequence of letters and digits, starting with a letter.




Appendix  B 


Parser  Error  Messages 


B.1 General

The parser makes a pass over the source code you have provided to check for correct syntax. If this
results in no error, the message "SYNTAX SCAN COMPLETE" is given. If an error occurs, the
parser will tell you what it was scanning, what would have been an acceptable next token, and
what some previous tokens were.

This initial syntax scan merely verifies that the format of what is seen is correct; it makes no
checks on the actual content. If this syntax scan is satisfactory, a second phase is entered where
content checks are made. What follows is a list of errors that can occur during this second, or
semantic, phase. If this second phase is completed successfully, the action you requested of the
parser is then carried out. Note that when parsing Pascal code, verification conditions for
procedures and functions which were completely parsed prior to a semantic error will be present
and can still be worked on with the simplifier.

The following listing is in alphabetical order. The notation "vcg" following a message indicates that
the  source  of  the  error  is  the  verification  condition  generator  rather  than  the  parser.  This  should 
not  normally  be  of  concern  to  a user. 


B.2  Semantic  errors 

ACTUAL  PARAMETER  TYPE  DOES  NOT  MATCH  FORMAL  DECLARATION 

The  parser  checks  procedure  and  function  calls  to  ensure  that  the  type  of  each  parameter  matches 
the  declaration  of  that  procedure  or  function.  One  of  yours  didn’t  make  it.  Information  printed 
out may include the type expected or the name of the formal parameter in the declaration.


ARGUMENT  LIST  EXPECTED 

A function name appeared in an expression and it was not followed by an argument list enclosed
in parentheses.


BAD  PUT  ENTRY-VERIFIER  ERROR 

An internal check in the parser symbol table entry code has discovered something that shouldn't
be there. If this were a program product of some manufacturer, you'd be instructed at this point to
send in a trouble report. As it is, the choices are less appealing! In any case, it would be unwise to
trust anything produced by the parser after getting this error.


BASE TYPE FOR POINTER NOT DEFINED

In an assertion within a Pascal program, you dereferenced an <identifier> using pointer notation. To
correctly translate this into an assertion the system understands, the parser has to be able to figure
out what reference class the <identifier> belongs to. It does this by looking up the entity in its
symbol table, and in this case it couldn't find it. If you want to include this as part of the
assertion, you will have to provide the reference class. Instead of this syntax, use
"#<base type> ⊂ <identifier> ⊃" (no blanks between # and <base type>).


BASE  TYPE  FOR  REFERENCE  CLASS  DOES  NOT  MATCH  WHAT  WAS  EXPECTED 

In an assertion within a Pascal program, you used the notation "#<identifier1> ⊂ <identifier2> ⊃"
(or some qualified form equivalent to this). Either <identifier2> was not of pointer type, or, if it
was of pointer type, its base type was not the same as <identifier1>.


BOOLEAN  EXPRESSION  EXPECTED 

An  expression  of  boolean  type  was  expected,  such  as  in  a WHILE  test  or  an  IF  test. 


BOTH  SIDES  OF  ASSIGNMENT  MUST  BE  COMPATIBLE  TYPES 

For an assignment statement to be correct, the type of the entity being stored into and the type of
the  expression  being  stored  must  be  compatible.  Thus,  they  must  both  be  numbers,  or  one  must 
be  a subset  of  the  other,  or  they  must  be  the  same  type.  You  had  an  assignment  statement  where 
this  was  not  the  case. 
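
The three conditions above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Type` record, its `numeric` flag, and its `parent` link (used to model "subset of") are hypothetical and are not taken from the verifier's symbol table.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Type:
    name: str
    numeric: bool = False
    parent: Optional["Type"] = None   # set when this type is a subset of another type

def is_subset_of(a, b):
    """True if type a was derived, directly or indirectly, as a subset of b."""
    while a.parent is not None:
        a = a.parent
        if a == b:
            return True
    return False

def assignment_compatible(target, source):
    """Same type, or both numbers, or one a subset of the other."""
    if target == source:
        return True
    if target.numeric and source.numeric:
        return True
    return is_subset_of(target, source) or is_subset_of(source, target)
```

For example, with `COLOR = Type("COLOR")` and `WARM = Type("WARM", parent=COLOR)`, an assignment between a `WARM` and a `COLOR` entity is accepted in either direction, while an assignment between `COLOR` and an unrelated `Type("BOOLEAN")` is rejected.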


BRANCHING  INTO  COMPOUND  STATEMENTS  PROHIBITED 

You may not branch into a WHILE, REPEAT, FOR, or WITH body using the GOTO
statement. If you need unlimited branching, you will have to create your control structure entirely
with GOTO, not using any of these iteration statements.


CASE  NAME  TYPE  MUST  MATCH  CASE  EXPRESSION 

At  the  head  of  a CASE  statement  is  an  expression  of  a certain  type.  Each  of  the  cases  following 
must  be  identified  with  a constant  of  the  same  type. 

CHAR  TYPE  MAY  ONLY  HAVE  ONE  CHARACTER  STRINGS 

An  entity  of  type  CHAR  may  be  a string  at  most  one  character  long.  Longer  strings  will  be 
allowed  eventually. 


CLASS NAME INCORRECTLY QUALIFIED OR USED

A class name must be followed by a period and another identifier when invoking a procedure or
function from the class externally. Alternatively, you tried to assign to a class procedure name or
function.


CONSTANT  DUPLICATED  IN  THIS  TYPE  DEFINITION 

A union  type  consists  of  ids  followed  by  types;  each  of  these  ids  must  be  distinct  within  a given 
type  definition.  You  duplicated  one  of  the  ids. 


CONSTANT  MAY  NOT  BE  QUALIFIED 

You have an identifier which was given a value in a CONST or CONSTANT statement. These
identifiers have the value you gave substituted by the parser; thus they are really parse-time
abbreviations. In particular, they are always scalars and can't be subscripted, or have record
fields, etc. following them.


CONSTANT  MAY  NOT  BE  STORED  INTO 

You  have  an  identifier  which  was  given  a value  in  a CONST  or  CONSTANT  statement.  These 
identifiers  have  the  value  you  gave  substituted  by  the  parser;  thus  they  are  really  parse  time 
abbreviations. Therefore, you can't store into them (that is, put them on the left-hand side of an
assignment statement), except as part of a subscript or something like that.


CONSTANT  OF  A KNOWN  ENUMERATED  TYPE  EXPECTED 

Each union type consists of a list of id-type pairs. Each id must be a constant of the same
enumerated type. You have given an id which is not a constant of an enumerated type.


CONSTANT  TYPE  DIFFERS  FROM  PREVIOUS  CONSTANTS 

Each union type consists of a list of id-type pairs. All the ids must be constants of the same
enumerated  type.  You  have  given  an  id  of  a different  type  than  previously  encountered  in  this 
declaration. 


DUPLICATE  LABEL  IN  CASE  STATEMENT 

The  same  label  appears  twice  in  a case  statement.  Each  case  must  appear  at  most  once. 


EMPTY CASE STATEMENT (vcg)

This  message  should  not  be  printed  under  any  circumstances.  If  it  does  occur,  it  indicates  that  the 
parser  has  produced  a case  statement  with  no  branches. 


ERROR  IN  ASSIGNMENT  STATEMENT  (vcg) 

This  message  should  not  be  printed  under  any  circumstances.  If  it  does  occur,  it  indicates  that  the 
parser  has  produced  an  illegal  assignment  statement. 

ERROR IN C.D.U - CASE 1
Caused  by  forgetting  to  set  _ to  T. 


ERROR  IN  C.D.U  - CASE  2 
Caused  by  forgetting  to  set  _ to  NIL. 


ERROR  IN  C.D.U  - THIRD  TYPE 

Now you've really done it. You were warned NOT to use the CONCURRENT Dynamic
Underbar feature UNLESS you talked to me first. Now that you are having trouble, don't expect
me always to solve YOUR problem.

This message MIGHT also be caused by incompleteness in the W matcher, so be sure to send a
complete minimal protocol to BUG-VERIFY % STANFORD, zip code 94305, and allow at least
nine months for delivery.


EXIT  ASSERTION  OMITTED  FROM  PROCEDURE  OR  FUNCTION 

An EXIT assertion is required by the system. The absence of one is usually detected by the
syntax scan. But when the word PROCEDURE or FUNCTION followed by just a name is
found, the syntax scan must permit it since it could be the body of a block declared forward. If it
isn't, this error is given.


FILES  CANNOT  APPEAR  IN  ASSIGNMENT  STATEMENTS 

You tried to assign to an identifier of type FILE. Files may appear only in assertions, READ
statements, and WRITE statements (in addition to being declared).


FILES OF ENTITIES OF POINTER TYPE ARE PROHIBITED

The  base  type  of  a file  may  not  be  of  type  POINTER. 


FOR  CONTROL  VARIABLE  MAY  NOT  BE  REDEFINED  INSIDE  FOR  STMT 

Pascal  prohibits  redefining  the  FOR  statement  control  variable  within  its  loop.  Note  that  this 
error  can  occur  when  the  control  variable  is  passed  to  a procedure  which  may  change  it  — i.e.,  as 
a VAR  parameter  or  when  it  is  declared  as  GLOBAL  within  the  called  procedure.  It  can  also 
occur  in  more  obvious  ways. 


FUNCTION  NAME  MAY  NOT  BE  USED  AS  VARIABLE  (vcg) 

You  have  a function  or  predicate  name  appearing  in  an  assertion  or  code  which  is  also  declared 
as  the  name  of  a variable.  This  is  not  permitted. 


FUNCTIONS  MAY  NOT  HAVE  SIDE  EFFECTS-STRICT  ENFORCEMENT 

In  order  to  permit  only  functions  without  side  effects,  the  parser  is  extremely  rigid  in  disallowing 
things.  In  particular:  function  bodies  may  not  contain  global  statements,  IO  statements,  or  NEW 
statements. In addition, functions may not have VAR parameters. This rather severely limits
functions!  You  may  have  to  make  your  function  into  a procedure  which  returns  its  value  as  a 
VAR  parameter.  Sorry! 


GENSYM AND YOU AGREE— SORRY!— RENAME YOUR VARIABLE

When the parser called the LISP function GENSYM to invent a name for some reason or
another,  the  name  returned  was  already  in  your  program,  declared  as  one  of  your  entities  in  this 
block.  You  must  change  the  name  of  the  entity  of  that  type.  This  message  will  usually  be  given 
in  addition  to  an  IDENTIFIER  DECLARED  MULTIPLY  message. 


GLOBALS  FROM  OUTSIDE  THE  MODULE  MUST  APPEAR  IN  VISIBLE  GLOBAL 
STMT 

Module  visible  procedures  may  have  two  global  statements:  one,  appearing  with  the  visible 
declaration,  describes  the  entities  global  to  the  module  that  the  procedure  might  change.  The 
second,  attached  to  the  invisible  declaration  of  the  procedure,  details  the  module  variables 
changed  by  this  procedure. 


IDENTIFIER DECLARED MULTIPLY IN ONE BLOCK

This particular identifier is already the name of something in this block. Change one or the
other. 


ID  IN  POINTER  OF  INCORRECT  TYPE 

When  defining  a pointer  type,  the  pointer  base  type  must  be  another  type  identifier.  Since  the 
base  type  for  a pointer  type  may  appear  before  it  is  defined,  this  error  may  not  appear  until  after 
processing  all  TYPE  statements  for  a particular  function  or  procedure. 


ID  NOT  DECLARED  OR  NOT  A VARIABLE 

In  processing  an  expression,  the  parser  found  an  identifier  that  was  not  in  the  symbol  table;  or  if 
it  was,  it  was  not  declared  as  a VAR  but  rather  was  of  some  other  kind.  This  error  can  occur,  for 
example,  if  a virtual  variable  appears  in  executable  code  (other  than  documentation  or  a 
PASSIVE  statement). 


ID  NOT  DECLARED  AS  VISIBLE  BASETYPE  NAME 

In  the  BASETYPE  specification  within  the  invisible  part  of  a module,  you  tried  to  declare  the 
specifications  of  an  identifier  that  was  not  declared  as  the  name  of  a basetype  in  the  visible 
specifications. 


ILLEGAL  ENTRY  ASSERTION  FOR  FUNCTION  (vcg) 

The  ENTRY  assertion  for  a function  may  not  contain  the  function  name. 


ILLEGAL  PROCEDURE  CALL  (vcg) 

The procedure call rule requires that each of the VAR parameters and GLOBAL variables in a
particular  procedure  call  refer  to  a distinct  variable. 
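
A minimal sketch of such a distinctness check is given below in Python. It assumes, for simplicity, that each VAR actual parameter and each GLOBAL variable is identified by a plain name; subscripted or qualified actual parameters are not modeled, and the function name `aliased_variables` is hypothetical.

```python
def aliased_variables(var_actuals, globals_changed):
    """Return the names that occur more than once among the VAR actual
    parameters and the GLOBAL variables of one procedure call."""
    seen = set()
    duplicates = []
    for name in list(var_actuals) + list(globals_changed):
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates
```

Under these assumptions, a call would be reported as illegal whenever the returned list is non-empty, for instance when the same variable is passed for two VAR parameters or when a VAR actual is also listed as GLOBAL in the called procedure.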


IMPROPER  SUBRANGE  DEFINITION 

Subranges  may  be  declared  as  explicit  types  or  as  subscripts  for  arrays.  They  are  usually  two 
values,  in  which  case  the  lower  value  of  the  subrange  must  really  appear  before  the  upper  value 
in the definition of the base type. In particular, for subranges of integers, the first integer must be
smaller than the second. Also the types of the two entities in the subrange must be compatible
with  each  other. 
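
The ordering requirement can be sketched as follows in Python. The function name `check_subrange`, the representation of an enumerated base type as a list of constant names, and the returned reason strings are assumptions made for this illustration.

```python
def check_subrange(low, high, enum_order=None):
    """Return None if the subrange low..high is proper, else a reason string.

    enum_order, when given, lists the constants of an enumerated base
    type in their order of definition."""
    if enum_order is not None:
        if low not in enum_order or high not in enum_order:
            return "bounds are not constants of the same enumerated type"
        if enum_order.index(low) >= enum_order.index(high):
            return "lower bound does not precede upper bound"
        return None
    if not (isinstance(low, int) and isinstance(high, int)):
        return "bounds are not compatible with each other"
    if low >= high:
        return "first integer is not smaller than the second"
    return None
```

For example, `check_subrange(1, 10)` reports no problem, while `check_subrange(10, 1)` and `check_subrange("BLUE", "RED", ["RED", "GREEN", "BLUE"])` both return a reason.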


INTERNAL ERROR IN VCG OR RUNCHECK (vcg)

This  message  is  produced  only  in  the  special  runtime  error  checking  version  of  the  verifier.  It 
indicates  a system  error  in  the  verifier. 


INVALID  ARGUMENT  TYPE  TO  ARITHMETIC  OR  LOGICAL  OPERATOR 

The  parser  checks  that  each  arithmetic  or  logical  operator  only  receives  sub-expressions  of  proper 
type;  thus  * expects  only  to  find  two  numbers,  NOT  a boolean,  etc. 


INVALID  CONSTRUCTOR  OR  SELECTOR  FOR  UNION  TYPE 

Union  type  construction  must  have  three  elements:  the  type  to  be  constructed,  the  tag  to  be 
associated  with  it,  and  the  value  to  be  associated  with  it,  in  that  order.  The  type  of  the  value 
must  be  consistent  with  the  tag,  and  the  value  must  be  present.  Therefore,  there  must  be  an 
expression enclosed in parentheses of the appropriate type, and there must be a tag of the
appropriate  type.  Union  selection,  however,  merely  consists  of  a union  variable  followed  by 
selection  of  a union  field.  No  expression  may  follow. 


INVALID  SUBRANGE  ITEM 

Subranges  may  be  declared  as  explicit  types  or  as  subscripts  for  arrays.  In  the  latter  case  only,  a 
VAR is permitted. In both cases, a number, an abbreviation for a number (an identifier defined in
a CONST  or  CONSTANT  statement),  or  a constant  of  an  enumerated  type  may  be  used.  None 
of  these  types  of  entities  were  found  in  your  definition. 


INVALID  TYPE  FOR  CASE  STATEMENT  EXPRESSION 

The  expression  following  the  keyword  CASE  must  be  of  scalar  type.  Further,  it  may  not  be  of 
type  REAL  or  a subset  of  type  REAL. 


INVALID  TYPE  FOR  CONTEXT  WHERE  USED 

An attempt was made to dereference (↑) an entity not of type pointer, or subscript an entity not of
type array. Alternatively, in a FOR statement, the index variable and both expressions must be
compatible with a numeric type. Finally, too many subscripts were present for a particular var
(i.e., there were two subscripts to an array which only had one dimension, or one subscript to a
var that was not an array).


KNOWN TYPE NAME EXPECTED

When  you  define  a type  in  terms  of  another  type,  the  second  type  must  already  be  known  to  the 
parser (exception: pointer base types). Also, the base type for a reference class appearing in a
GLOBAL  statement  must  be  known  to  the  parser.  Finally,  the  type  of  a parameter  to  a function 
or  procedure  must  be  known  to  the  parser  before  seeing  the  function  or  procedure  declaration. 


LABEL  APPEARS  IN  PROGRAM  BUT  IS  NOT  DECLARED 

You use a label in your program unit but do not declare it on entering the program unit (with the
LABEL declaration). Labels must be local to the procedure in which they appear, and must be
declared there.


LABEL  DECLARED  AND  REFERENCED  BUT  NOT  PRESENT 

Somewhere in your procedure or function, you have stated GOTO n, but after completing
parsing your procedure or function, the label n was not found on any statement within that
procedure or function. Note that if n is within the body of a nested procedure or function, it is
not regarded as being within the body of the outer procedure or function.


LABEL  MUST  BE  POSITIVE  INTEGER 
A label must be a positive integer; it cannot be zero.

LABEL  SPECIFIED  MULTIPLY  IN  ONE  PROGRAM  UNIT 

The same label appears on two or more statements in one procedure or function.
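
The four label checks described in the preceding entries can be summarized in a short Python sketch. It is an illustration only: the function name `check_labels`, its three input lists, and the assumption that an undeclared GOTO target is reported with the "NOT DECLARED" message are not taken from the parser itself.

```python
from collections import Counter

def check_labels(declared, placed, goto_targets):
    """declared: labels in the LABEL declaration of one program unit.
    placed: labels attached to statements (repeats kept).
    goto_targets: labels named in GOTO statements.
    Returns a sorted list of (label, message) pairs."""
    errors = set()
    for label in list(declared) + list(placed) + list(goto_targets):
        if label <= 0:
            errors.add((label, "LABEL MUST BE POSITIVE INTEGER"))
    for label in list(placed) + list(goto_targets):
        if label not in declared:
            errors.add((label, "LABEL APPEARS IN PROGRAM BUT IS NOT DECLARED"))
    for label, count in Counter(placed).items():
        if count > 1:
            errors.add((label, "LABEL SPECIFIED MULTIPLY IN ONE PROGRAM UNIT"))
    for label in goto_targets:
        if label in declared and label not in placed:
            errors.add((label, "LABEL DECLARED AND REFERENCED BUT NOT PRESENT"))
    return sorted(errors)
```

Because each program unit is checked on its own in this sketch, a label placed inside a nested procedure would simply not appear in the `placed` list of the enclosing procedure.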


MISSING  ASSERTION  ON  PATH  THROUGH  LABEL  (vcg) 

The  program  contains  a closed  path  formed  by  a GOTO,  but  there  is  no  assertion  anywhere  in 
the  path. 
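
One way to picture this check is as a search for a cycle in the control-flow graph after all statements carrying an assertion have been removed. The Python sketch below follows that idea; the graph encoding and the name `unasserted_cycle_exists` are assumptions made for illustration and do not describe the verification condition generator's own data structures.

```python
def unasserted_cycle_exists(edges, asserted):
    """edges: dict mapping a node to the list of its control-flow successors.
    asserted: set of nodes that carry an assertion.
    True if some closed path contains no asserted node."""
    # Drop asserted nodes: any cycle that survives has no assertion on it.
    graph = {n: [s for s in succs if s not in asserted]
             for n, succs in edges.items() if n not in asserted}
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY
        for succ in graph.get(node, []):
            state = colour.get(succ, WHITE)
            if state == GREY:           # back edge: a closed path was found
                return True
            if state == WHITE and visit(succ):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(graph))
```

For a loop formed by a label `L1` and a statement `S` that jumps back to it, the function returns True when neither node is asserted and False once an assertion is attached to either one.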


MISSING  ITEM  IN  LOADSYMBOLS  INVOCATION 

The LOADSYMBOLS command contains two parameters. The first is the name of the
procedure  whose  symbol  table  environment  is  being  recreated;  the  second  is  the  name  of  the  file 
containing  the  symbol  table  code.  One  of  these  was  missing. 


MULTIPLY DEFINED IDENTIFER IN INVALID CONTEXT

Contrary to the usual scope rules, once an entity is defined to be a TYPE, MODULE, or
SCHEDULER identifier, it may not be redefined as a TYPE, MODULE, or SCHEDULER
identifier in a less global scope. Give the new type, module, or scheduler another name.

NAME  OF  MODULE  EXPECTED 

In  a CREATE  statement,  the  identifier  following  the  colon  must  be  the  name  of  an  entity 
previously  defined  as  a MODULE. 

NEW  STATEMENT  MUST  HAVE  POINTER  ARGUMENT 
A NEW  statement  can  only  initialize  an  entity  of  type  pointer. 

NUMBER  OF  PARAMETERS  IN  CALL  DOES  NOT  MATCH  DECLARATION 

A procedure  or  function  may  be  executed  by  being  called  only  with  exactly  as  many  parameters  as 
it  was  declared  with. 


PARAMETER  LIST  NOT  PRECEDED  BY  FUNCTION  NAME 

While  processing  an  expression,  a parameter  list  (a  list  of  expressions  enclosed  in  parentheses)  was 
found.  However,  the  entity  preceding  the  parameter  list  was  not  of  type  function. 


PATTERN  VARIABLES  MAY  NOT  BE  PREDICATE  OR  FUNCTION  SYMBOLS 


A variable name appearing in a PATTERN statement in a rulefile was found in a context where
it would make a predicate or function into a pattern. This is a second-order match, and is
prohibited by the prover. Rulefile predicates and functions must be constants; they cannot be
instantiated in the prover.


PROCEDURE  NAME  EXPECTED 

You  had  a statement  which  looked  like  a procedure  call,  but  the  entity  that  should  be  the  name  of 
the  procedure  was  not  found  or  was  declared  to  be  something  else. 


PROCEDURE OR FUNCTION DECLARED FORWARD AND NOT FOUND

You  declared  a function  or  procedure  to  be  FORWARD  and  then  didn’t  provide  the  body  of  the 
function  or  procedure.  If  you  just  want  to  specify  the  properties  of  a function  or  procedure 
without  specifying  its  body,  use  EXTERNAL  or  EXTERN  instead  of  FORWARD. 


PROCEDURE  OR  FUNCTION  DELCARED  FORWARD  AND  RESPECIFIED 

When  the  body  of  a procedure  or  function  declared  forward  appears,  the  parameter  list,  type, 
initial,  entry,  exit,  and  global  portions  are  not  duplicated.  The  format  is  "PROCEDURE  or 
FUNCTION <identifier> ; <block>".


RECORD  FIELD  MODIFIES  ENTITY  NOT  OF  TYPE  RECORD 

Following an entity, the notation ".<identifier>" was found, as if the entity were a record of which a
particular field was being selected. However, the entity being so modified was not of type
RECORD.


RECORD  FIELD  NAME  MAY  NOT  BE  USED  AS  VARIABLE  (vcg) 

You  have  used  a record  field  name  that  is  the  same  as  the  name  of  a variable  in  your  program. 
This  is  not  permitted. 


REFERENCE  CLASS  EXPECTED 

While processing an assertion in Pascal code, the parser found a term of the form
"#<identifier1> ⊂ <identifier2> ⊃". The entity #<identifier1> was not the name of a reference class
known by the parser. You need to declare a type that is a ↑<identifier1> to get the reference class.


SCHEDULER  MAY  NOT  BE  SCHEDULED 

A scheduler is used to control access to modules, and is assumed to run in a hardware mutual
exclusion  state.  As  such,  to  have  a scheduler  for  a scheduler  is  a built-in  deadlock.  Therefore,  a 
syntax  error  is  given. 


SCHEDULER  NOT  DECLARED  OR  OF  WRONG  TYPE 

A scheduler  for  a module  must  be  of  type  scheduler.  Your  name  wasn’t.  Alternatively,  you  tried 
to  enter  some  condition  variables  and  didn't  have  a scheduler  with  a RECEIVES  field 
(concurrent  version  only). 


SUBSCRIPT TYPE DOES NOT MATCH SUBSCRIPT DECLARATION

Each subscript of an array must be of a type compatible with the declaration of that array. Your
use of an array did not match on one or more of its subscripts. The printout may tell you the
type expected.


SUBSCRIPT  TYPE  MUST  BE  SCALAR 

When  defining  an  array  type,  the  type  of  each  subscript  may  not  be  a record,  pointer,  array,  or 
file. 


SYMBOL  TABLE  TOO  OLD  - PLEASE  RECREATE  IT 

The LOADSYMBOLS and DUMPSYMBOLS operations have an internal check which makes
sure that they are used consistently. You have tried to do a LOADSYMBOLS operation using a
file that was created too long ago; there has been an incompatible change in the verifier symbol
table structure since then. You must recreate the file by another DUMPSYMBOLS operation or
get an older verifier.


THIS  BUILT-IN  FUNCTION  MAY  NOT  BE  QUALIFIED 

You tried to follow a built-in function call by additional characters. Most built-in functions, such
as TAG or EOF, may not be qualified by dereferencing, record fields, subscripts, or union
selection.


THIS  ITEM  MAY  NOT  BE  USED  IN  RULEFILES 

Currently not used; it may be adopted when type checking is extended in rulefiles.


THIS  PROCEDURE  NOT  FOUND  ON  YOUR  SYMBOL  TABLE  FILE 

The  LOADSYMBOLS  operation  has  gone  through  the  entire  symbol  table  file  you  gave  it  and 
did  not  find  the  environment  of  the  procedure  or  function  you  specified. 


TOO  MANY  CONDITIONS  IN  THIS  CLASS-CHANGE  CVS 

The maximum number of condition variables which may appear within any class is determined
by the value of a constant named CVS which must appear in your program. CVS did, in fact,
appear, but you tried to declare a class containing more condition variables than it allows
(concurrent version only).


TYPE ERROR IN DATA TRIPLE

In  a data  triple  appearing  in  a program  assertion,  the  second  entity  of  the  data  triple  must  be  of  a 
correct  type  to  qualify  the  first  entity.  The  third  entity  must  be  of  a type  which  can  be  stored  in 
an  element  of  the  first  entity. 


UNDEFINED  OR  UNKNOWN  RECORD  FIELD 

You tried to qualify an entity of record type with a record field which did not appear in the
declaration of that type.


UNDEFINED  OR  UNKNOWN  UNION  TYPE  FIELD 

You tried to qualify an entity of union type with a union field which did not appear in the
declaration of that type.


UNION FIELD MODIFIES ENTITY NOT OF TYPE UNION

You tried to modify an expression not of union type with a union field.


UNKNOWN  ERROR  MESSAGE  — PARSER  OR  VCG  ERROR 

An  attempt  was  made  to  emit  an  error  message  from  within  the  parser  or  VCG.  However,  that 
message  did  not  exist  on  the  error  message  file.  Please  let  someone  who  fixes  things  know! 


VARIABLE IN WITH-STMT NOT OF TYPE RECORD

The expressions following the keyword WITH must each evaluate to a variable of type
RECORD.


VARIABLE  MUST  APPEAR  IN  GLOBAL  STATEMENT 

Within a procedure, you tried to reference a global variable which did not appear in a GLOBAL
statement. Globals may be referenced within functions without appearing in a GLOBAL
statement; indeed this statement is prohibited within functions. See the next error for further
discussion.


VARIABLE MUST APPEAR IN GLOBAL STATEMENT PRECEDED BY VAR

You have tried to change the value of a global variable. When you do this in a procedure, you
must put the name of the variable (or reference class, for pointer changes) into a GLOBAL
statement VAR list. The VAR list is necessary only when values are changed, not when they are
merely referenced. If the global is merely referenced, it need not be preceded by VAR (and
omitting VAR will simplify proof problems for you). GLOBAL statements are not permitted in
functions; in that case you may have to convert the function into a procedure which returns its
value as a VAR parameter.

A global variable appearing in an INITIAL statement must also appear in a GLOBAL
statement. The assumption is that changing the value of the global is intended; if the global is
not changed, merely use the global name in assertions and drop the INITIAL statement.

Note  that  reference  classes  of  pointer  types  may  be  globals,  and  thus  may  have  to  appear  in  the 
GLOBAL  statement. 


VAR  PARAMETER  MAY  NOT  HAVE  EXPRESSION  PASSED  TO  IT 

You have tried to pass an expression to a procedure in a position where a VAR parameter was
declared. This is not permitted, as it is not defined what it means to store into such an entity in
Pascal. You can pass an expression to a non-VAR parameter, but of course such an expression will
be strictly an input value to the procedure. Note also that GLOBAL statements are not permitted
in functions, which may not have side effects. Thus, getting this syntax error within a function
can require rewriting the function as a procedure.


VISIBLE  BASE  TYPE  NOT  DEFINED  WITHIN  MODULE 

A type  name  appearing  in  a BASETYPE  statement  must  be  fully  specified  within  the  module.  It 
must  appear  in  a normal  TYPE  statement  therein. 


WAIT_FOR STMT REQUIRES CONDITION VAR AS PARAMETER

A wait_for statement takes exactly one parameter, which must be declared as a condition
variable within the class (concurrent version only).


WAIT_FOR  STMT  REQUIRES  APPROPRIATE  DECLARATION  WITHIN  SCHEDULER 

To use a wait_for statement, there must be a scheduler containing a procedure named wait_for.
Further, that scheduler must have exactly two parameters: the first of type CVLINK, the second
of type SCHEDPROCNAME (concurrent version only).


Notation

x_0 is a fresh identifier
#t is a reference class


C.1 Assertion statements


ASSERT

P ⊃ L,  L ⊃ R
------------------
P {ASSERT L} R


ASSUME

P ∧ Q ⊃ R
------------------
P {ASSUME Q} R


COMMENT

P ⊃ (Q ∧ (Q ⊃ R))
------------------
P {COMMENT Q} R

C.2  Basic  executable  statements 

ASSIGNMENT

P[e/x] {x:=e} P                              (where x is an identifier)
x_0=<x,.f,e> ⊃ P[x_0/x] {x.f:=e} P           (where x is a record)
a_0=<a,[i],e> ⊃ P[a_0/a] {a[i]:=e} P         (where a is an array)
#t_0=<#t,⊂x⊃,e> ⊃ P[#t_0/#t] {x↑:=e} P       (where x↑ has type t)

Here P[e/x] denotes P with e substituted for x.
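
The substitution in the first axiom, and the treatment of a[i]:=e as an assignment of the data
triple <a,[i],e> to a, can be illustrated with a small Python sketch. It assumes that assertions
are written as Python expressions; the names substitute, update and wp_assign are hypothetical
and are not part of the verifier.

```python
import ast


def substitute(pred: str, var: str, expr: str) -> str:
    """Return pred with every occurrence of the name var replaced by expr."""
    replacement = ast.parse(expr, mode="eval").body

    class Subst(ast.NodeTransformer):
        def visit_Name(self, node):
            # The replacement itself is not revisited, so the substitution
            # is simultaneous even when expr mentions var.
            return replacement if node.id == var else node

    tree = Subst().visit(ast.parse(pred, mode="eval"))
    return ast.unparse(ast.fix_missing_locations(tree))


def update(a, i, e):
    """Model of the data triple <a,[i],e>: a copy of a whose element i is e."""
    b = list(a)
    b[i] = e
    return b


def wp_assign(var: str, expr: str, post: str) -> str:
    """Precondition given by the axiom for var:=expr: post with expr for var."""
    return substitute(post, var, expr)


# x := y + 1 with exit assertion x > 0 needs y + 1 > 0 beforehand.
pre = wp_assign("x", "y + 1", "x > 0")

# a[i] := e is handled as a := <a,[i],e> (hypothetical encoding via update).
pre_array = wp_assign("a", "update(a, i, e)", "a[0] == 7")
```

Evaluating pre in a state where y is 0 gives true, and in a state where y is -1 gives false, which
is what the axiom predicts for the exit assertion x > 0.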


CASE

for i = 1, ..., n:  Q {ASSUME c=eᵢ; Sᵢ} R,
c=e₁ ∨ ... ∨ c=eₙ
------------------------------------------
Q {CASE c OF e₁:S₁; ...; eₙ:Sₙ} R

The precondition c=e₁ ∨ ... ∨ c=eₙ is omitted if c has a subrange type containing only e₁, ..., eₙ.


CONDITIONAL

Q {ASSUME L; B} R,  Q {ASSUME ¬L; C} R
---------------------------------------
Q {IF L THEN B ELSE C} R
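
The two premises of the CONDITIONAL rule can be checked by exhaustive testing on a small finite
set of states. The Python sketch below does this; it is a test, not a proof, and the names
holds_triple and check_conditional and the example program are assumptions made for illustration.

```python
from itertools import product


def holds_triple(pre, stmt, post, states):
    """Test pre{stmt}post on every state of a finite collection."""
    return all(post(stmt(dict(s))) for s in states if pre(s))


def check_conditional(Q, L, B, C, R, states):
    """Test both premises: Q{ASSUME L; B}R and Q{ASSUME not L; C}R."""
    # ASSUME strengthens the precondition, as in the ASSUME rule.
    then_ok = holds_triple(lambda s: Q(s) and L(s), B, R, states)
    else_ok = holds_triple(lambda s: Q(s) and not L(s), C, R, states)
    return then_ok and else_ok


# Example: IF x < 0 THEN y := -x ELSE y := x, with exit assertion y = |x|.
states = [{"x": x, "y": y} for x, y in product(range(-3, 4), repeat=2)]
Q = lambda s: True
L = lambda s: s["x"] < 0
B = lambda s: {**s, "y": -s["x"]}
C = lambda s: {**s, "y": s["x"]}
R = lambda s: s["y"] == abs(s["x"])

ok = check_conditional(Q, L, B, C, R, states)   # both branches correct
bad = check_conditional(Q, L, C, C, R, states)  # wrong THEN branch
```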

GOTO  and  Labels 

The  verifier  does  not  permit  a block  to  be  exited  by  a non-local  GOTO.  The  other  restriction  is 
that  every  closed  path  formed  by  GOTOs  and  labels  must  contain  an  ASSERT  statement.  Each 
path  through  a labelled  statement  produces  a separate  verification  condition.  The  rule  used  by 
the  verifier  constructs  an  assertion  at  each  label.  In  the  general  case,  it  is  somewhat  complicated. 
However,  if  a label  is  at  an  ASSERT  statement,  the  rule  for  GOTO  is 

P ⊃ Rj
------------------
P {GOTO j} Q,

where the statement at label j is ASSERT Rj.
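
The restriction that every closed path formed by GOTOs and labels must contain an ASSERT
statement amounts to saying that the control-flow graph becomes acyclic once the ASSERT
statements are removed. The Python sketch below checks this with a depth-first search; the graph
encoding and the name every_cycle_has_assert are assumptions made for illustration and say
nothing about how the verifier itself performs the check.

```python
def every_cycle_has_assert(graph, asserts):
    """True if each closed path in graph passes through a node in asserts.

    graph maps a statement name to the list of its successors.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in graph}

    def has_cycle(n):
        # Depth-first search that never enters an ASSERT node.
        colour[n] = GREY
        for m in graph.get(n, ()):
            if m in asserts:
                continue
            state = colour.get(m, WHITE)
            if state == GREY:
                return True
            if state == WHITE and has_cycle(m):
                return True
        colour[n] = BLACK
        return False

    return not any(colour[n] == WHITE and has_cycle(n)
                   for n in graph if n not in asserts)


# A loop L1 -> S -> A -> L1, where A may also leave the loop.
graph = {"L1": ["S"], "S": ["A"], "A": ["L1", "exit"], "exit": []}
with_assert = every_cycle_has_assert(graph, {"A"})      # A is an ASSERT
without_assert = every_cycle_has_assert(graph, set())   # no ASSERT on the loop
```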


NEW 

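There are two axioms for the NEW(x) statement. The first axiom applies if x is an identifier.
Otherwise, the second axiom is used.

¬POINTER_TO(x_0,#t) ∧ x_0≠NIL ⊃ Q[#t∪{x_0}/#t, x_0/x] {NEW(x)} Q,

where Q[#t∪{x_0}/#t, x_0/x] denotes Q with #t∪{x_0} substituted for #t and x_0 for x.

Q {NEW(s_0); s:=s_0} R
----------------------
Q {NEW(s)} R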


REPEAT

REPEAT  statements  are  translated  into  equivalent  WHILE  statements.  As  part  of  this 
translation,  labels  appearing  in  the  body  are  automatically  renamed. 

WHILE 

I ∧ L {B} I,  P ⊃ I ∧ ∀SB (I ∧ ¬L ⊃ Q)
---------------------------------------
P {INVARIANT I WHILE L DO B} Q,

where SB is the set of variables changed in B.
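
The two premises of the WHILE rule can be tested on a small finite set of states. The Python
sketch below does so for a loop that sums the integers 1 to n; it is an exhaustive test over a
bounded domain, not a proof, and the name check_while and the example loop are assumptions made
for illustration.

```python
from itertools import product


def check_while(P, I, L, B, Q, states, changed):
    """Test both premises of the WHILE rule on a finite set of states."""
    # Premise 1: I and L {B} I.
    preserved = all(I(B(dict(s))) for s in states if I(s) and L(s))

    # Premise 2: P implies I and, for all values of the changed variables,
    # (I and not L) implies Q.
    def exit_ok(s):
        same_rest = (t for t in states
                     if all(t[v] == s[v] for v in s if v not in changed))
        return all(Q(t) for t in same_rest if I(t) and not L(t))

    established = all(I(s) and exit_ok(s) for s in states if P(s))
    return preserved and established


# INVARIANT 2*s = i*(i+1) and i <= n  WHILE i < n DO (i := i+1; s := s+i)
states = [{"n": n, "i": i, "s": s}
          for n, i, s in product(range(5), range(6), range(16))]
P = lambda st: st["i"] == 0 and st["s"] == 0
I = lambda st: st["i"] <= st["n"] and 2 * st["s"] == st["i"] * (st["i"] + 1)
L = lambda st: st["i"] < st["n"]
Q = lambda st: 2 * st["s"] == st["n"] * (st["n"] + 1)


def B(st):
    st["i"] += 1
    st["s"] += st["i"]
    return st


valid = check_while(P, I, L, B, Q, states, {"i", "s"})

# An invariant that is preserved but too weak to imply Q on exit.
I_weak = lambda st: st["i"] <= st["n"]
too_weak = check_while(P, I_weak, L, B, Q, states, {"i", "s"})
```

The second call fails on the second premise: the weak invariant says nothing about s, so the
quantification over the changed variables finds exit states in which Q does not hold.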


WITH 

WITH  statements  are  eliminated  by  translation. 


C.3 Procedures and functions


PROCEDURES 


A procedure declaration has the form:

PROCEDURE p(U; VAR V);
GLOBAL (G; VAR H);
INITIAL X=X0;
ENTRY I(U,G,V,H);
EXIT O(U,G,V,H,X0);
BEGIN
body
END;


where

U is the set of formal value parameters
V is the set of formal variable parameters
G is the set of unchanged global variables
H is the set of changed global variables
X0 is a set of logical variables that may appear in assertions.

Two rules are used to define the semantics of procedures. The procedure declaration rule is used
to check the consistency of the assertions in the declaration. The procedure call rule is used to
check the consistency of programs that call p.

There is a slight complication in the declaration rule concerning value parameters whose values
can be changed by the body. If a procedure q calls p with a value parameter, p operates on a
copy of the value, so if p changes the value of its parameter, the change is not visible to q. In the
procedure declaration rule, this behavior is modelled by requiring the exit assertion to refer to the
initial values of value parameters, before execution of the body. The value parameters U are
divided into the subsets Uv of variables that can be changed in the body, and Uc, variables that
remain constant. New variables U0v are introduced to stand for the initial values of value
parameters that can be changed in the body. Occurrences of variables in Uv in the exit assertion
are replaced by the new variables in U0v, to ensure that the exit assertion refers only to the
initial values.
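
The behavior of value parameters described above can be illustrated with a small Python sketch.
It assumes a VAR parameter can be modelled by returning its final value to the caller; the
procedures p and q and the variables u0 and v0 are hypothetical, with u0 playing the role of a
variable in U0v.

```python
def p(u, v):
    """Model of PROCEDURE p(u; VAR v), where the body changes u."""
    u0, v0 = u, v          # initial values, as introduced by the rule
    u = u * 2              # the body changes its own copy of u
    v = v0 + u
    # The exit assertion is stated in terms of u0, not the changed u.
    assert v == v0 + 2 * u0
    return v               # final value of the VAR parameter


def q():
    """Caller: the change p makes to its value parameter is not visible here."""
    a, b = 3, 10
    b = p(a, b)
    return a, b
```

After the call, a in q still holds its original value, so an exit assertion about the changed
copy of u would say nothing useful to the caller.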

The  declaration  rule  checks  the  consistency  of  a procedure  with  its  ENTRY  and  EXIT  assertions 
by  proving  the  formula 

I(Uc,Uv,G,V,H) ∧ X=X0 ∧ Uv=U0v {body} O(Uc,U0v,G,V,H).

In the procedure call rule below, A is the set of actual value parameters, and B is the set of actual
VAR parameters. Each VAR parameter consists of an identifier, bᵢ, followed by a possibly empty
sequence of component selectors, sᵢ. The call rule introduces new variables t₁, ..., tₙ to save the
values of the selector sequences of the VAR parameters. B_0 is the set {b₁_0, ..., bₙ_0} of new
variables introduced to stand for the values of the VAR parameters after the procedure call.
Similarly, H_0 is a set {h₁_0, ..., hₘ_0} of new variables for the VAR globals. The variables
actualsX are actual initial values corresponding to the formal variables in X0.

The formula *p asserts that the final value of each variable changed by the procedure call is
functionally dependent on the initial values of all the parameters. A new uninterpreted function
symbol, pᵢ(A,G,B,H), is introduced to stand for the final value of each VAR parameter and VAR
global.

Q {t₁←s₁; ...; tₙ←sₙ} (I(A,G,B,H) ∧
    (O(A,G,B_0,H_0,actualsX) ∧ *p(A,G,B,H,B_0,H_0)
        {b₁←<b₁,t₁,b₁_0>; ...; bₙ←<bₙ,tₙ,bₙ_0>; H←H_0} R))
------------------------------------------------------------
Q {p(A,B)} R

where *p(A,G,B,H,B_0,H_0) =

    b₁_0=p₁(A,G,B,H) ∧ ... ∧ bₙ_0=pₙ(A,G,B,H)
    ∧ h₁_0=pₙ₊₁(A,G,B,H) ∧ ... ∧ hₘ_0=pₙ₊ₘ(A,G,B,H).

Example: consider the declaration

PROCEDURE p(d:m1; VAR e:m2; VAR f:m3);
INITIAL d=d0, f=f0;
ENTRY I(d,e,f);
EXIT O(d,e,f,d0,f0);
BEGIN
body
END;

Then for the call p(a,b[i],x↑.f), we have:

Q {t₁←selector([i]); t₂←selector(⊂x⊃.f)}
    (I(a,b[i],#t⊂x⊃.f) ∧
    (O(a,b_0,#t_0,a,#t⊂x⊃.f) ∧ b_0=p₁(a,b[i],#t⊂x⊃.f) ∧ #t_0=p₂(a,b[i],#t⊂x⊃.f)
        {b←<b,t₁,b_0>; #t←<#t,t₂,#t_0>} R(a,b,x,#t)))
------------------------------------------------------------
Q {p(a,b[i],x↑.f)} R(a,b,x,#t).


The assignment rule reduces the upper formula to

Q ⊃ (I(a,b[i],#t⊂x⊃.f)
    ∧ (O(a,b_0,#t_0,a,#t⊂x⊃.f) ∧ b_0=p₁(a,b[i],#t⊂x⊃.f) ∧ #t_0=p₂(a,b[i],#t⊂x⊃.f)
    ⊃ R(a,<b,[i],b_0>,x,<#t,⊂x⊃.f,#t_0>))).


FUNCTIONS 

A function declaration has the form

FUNCTION f(U):m;
ENTRY I(U);
EXIT O(U,f);
BEGIN
body
END;

where U is the set of formal value parameters.


The body contains assignment statements of the form f:=e, which assign a value to the function.
Occurrences of f as a term in the exit assertion O are interpreted to stand for the value of the
function. When f appears as a function sign in O, it has its usual interpretation: the function f.



The system checks the consistency of a function declaration by proving the formula

I(U) {body[f_fn:=e / f:=e]} O(U,f_fn).

A new variable, f_fn, is introduced and the assignment statements f:=e are renamed to f_fn:=e,
to avoid conflicts between the two interpretations of f. The formula above is used when none of
the variables in U can be changed by the body. When this is not the case, additional new variables
are introduced as in the procedure declaration rule.

The semantics of function calls are not given by a single rule. Instead, the semantics of the
executable Pascal statements have been defined to account for function calls. To simplify the
presentation, the axioms stated elsewhere in this appendix assume that no function calls occur in
executable statements. Thus the actual rules implemented in the system are slightly more complex
than the ones listed here.

To indicate the general approach, consider assignment statements x[i]:=j, where i and j are
expressions containing function calls. Let f₁(A₁), ..., fₙ(Aₙ) be an order in which the function
calls can be evaluated to execute the assignment, and let Iₖ(Uₖ) and Oₖ(Uₖ,fₖ) be the entry and
exit assertions for fₖ. Then under the actual axiom used in the system, the conditions for
assignment are expressed by

I₁(A₁) ∧ (O₁(A₁,f₁(A₁)) ⊃ ... Iₙ(Aₙ) ∧ (Oₙ(Aₙ,fₙ(Aₙ)) ⊃ R(<x,[i],j>)) ... ) {x[i]:=j} R.